HUMAN NEUTRALIZING MONOCLONAL ANTIBODIES
TO HUMAN IMMUNODEFICIENCY VIRUS 

Technical Field 

The present invention relates generally to the field of 
immunology and specifically to human monoclonal antibodies 
which bind and neutralize human immunodeficiency virus (HIV).

Background 

1. HIV Immunotherapy 

HIV is the focus of intense studies as it is the 
causative agent for acquired immunodeficiency syndrome 
(AIDS). Immunotherapeutic methods are one of several
approaches to prevention, cure or remediation of HIV 
infection and HIV-induced diseases. Specifically, the use of 
neutralizing antibodies in passive immunotherapies is of 
central importance to the present invention. 

Passive immunization of HIV-1 infected humans using human 
sera containing polyclonal antibodies immunoreactive with HIV 
has been reported. See, for example, Jackson et al., Lancet,
September 17:647-652 (1988); Karpas et al., Proc. Natl.
Acad. Sci. USA, 87:7613-7616 (1990).

Numerous groups have reported the preparation of human
monoclonal antibodies that neutralize HIV isolates in vitro.
The described antibodies typically have immunospecificities
for epitopes on the HIV glycoprotein gp160 or the related
external surface envelope glycoprotein gp120 or the
transmembrane glycoprotein gp41. See, for example, Levy,
Micro. Rev., 57:183-289 (1993); Karwowska et al., Aids
Research and Human Retroviruses, 8:1099-1106 (1992); Takeda
et al., J. Clin. Invest., 89:1952-1957 (1992); Tilley et al.,
Aids Research and Human Retroviruses, 8:461-467 (1992);
Laman et al., J. Virol., 66:1823-1831 (1992); Thali et al.,
J. Virol., 65:6188-6193 (1991); Ho et al., Proc. Natl. Acad.
Sci. USA, 88:8949-8952 (1991); D'Souza et al., AIDS,
5:1061-1070 (1991); Tilley et al., Res. Virol., 142:247-259
(1991); Broliden et al., Immunol., 73:371-376 (1991);
Matour et al., J. Immunol., 146:4325-4332 (1991); and
Gorny et al., Proc. Natl. Acad. Sci. USA, 88:3238-3242
(1991).

To date, none of the reported human monoclonal
antibodies have been shown to be effective in passive
immunization therapies. Further, as monoclonal
antibodies, they each react with an individual epitope
on the HIV envelope glycoprotein, gp120 or gp160. The
epitope against which an effective neutralizing antibody
immunoreacts has not been identified.

There continues to be a need to develop human
monoclonal antibody preparations with significant HIV
neutralization activity. In addition, there is a need for
monoclonal antibodies immunoreactive with additional and
diverse neutralizing epitopes on HIV gp120 and gp41 in
view of recent studies suggesting that gp120 and gp41 are
involved both in binding of HIV to the cell and in
post-binding events including envelope shedding
and cleavage. See, for review, Levy, Micro. Rev.,
57:183-289 (1993). Additional (new) epitope specificities are
required because, upon passive immunization, the
treated patient can produce an immune response
against the administered antibody, thereby inactivating
the particular therapeutic antibody.

2. Human Monoclonal Antibodies Produced From
Combinatorial Phagemid Libraries

The use of filamentous phage display vectors, referred
to as phagemids, has been repeatedly shown to allow the
efficient preparation of large libraries of monoclonal
antibodies having diverse and novel immunospecificities.
The technology uses a filamentous phage coat protein
membrane anchor domain as a means for linking gene-product
and gene during the assembly stage of filamentous phage
replication, and has been used for the cloning and
expression of antibodies from combinatorial libraries.
Kang et al., Proc. Natl. Acad. Sci. USA, 88:4363-4366
(1991). Combinatorial libraries of antibodies have been
produced using both the cpVIII membrane anchor (Kang et
al., supra) and the cpIII membrane anchor. Barbas et al.,
Proc. Natl. Acad. Sci. USA, 88:7978-7982 (1991).

The diversity of a filamentous phage-based
combinatorial antibody library can be increased by
shuffling of the heavy and light chain genes (Kang et al.,
Proc. Natl. Acad. Sci. USA, 88:11120-11123 (1991)), by
altering the CDR3 regions of the cloned heavy chain genes
of the library (Barbas et al., Proc. Natl. Acad. Sci.
USA, 89:4457-4461 (1992)), and by introducing random
mutations into the library by error-prone polymerase chain
reactions (PCR) (Gram et al., Proc. Natl. Acad. Sci. USA,
89:3576-3580 (1992)).

Filamentous phage display vectors have also been 
utilized to produce human monoclonal antibodies 
immunoreactive with hepatitis B virus (HBV) or HIV 
antigens. See, for example, Zebedee et al., Proc. Natl.
Acad. Sci. USA, 89:3175-3179 (1992); and Burton et al.,
Proc. Natl. Acad. Sci. USA, 88:10134-10137 (1991),
respectively. None of the previously described human 
monoclonal antibodies produced by phagemid vectors that 
are immunoreactive with HIV have been shown to neutralize 
HIV. 

In particular, none of the previously described human
monoclonal antibodies produced by phagemid vectors are
capable of neutralizing a majority of the field isolates
of HIV. It is believed that certain of the antibodies
described herein are particularly effective at
neutralizing HIV because the antibodies immunoreact with
an important antigenic determinant present on "mature"
gp120 and not present on the HIV precursor protein gp160.

Brief Description of the Invention 

Methods using phagemid vectors have now been discovered
that identify and isolate, from combinatorial libraries,
human monoclonal antibodies that neutralize HIV, and that
allow the rapid preparation of large numbers of
neutralizing antibodies of completely human derivation.
The identified neutralizing antibodies define new epitopes
on the HIV gp120 and gp41 glycoproteins, thereby
increasing the availability of new immunotherapeutic human
monoclonal antibodies.

The invention provides human monoclonal antibodies that 
neutralize HIV, and also provides cell lines used to 
produce these monoclonal antibodies. 

Also provided are amino acid sequences which confer
neutralization function to the antigen binding domain of a
monoclonal antibody, and which can be used immunogenically
to identify other antibodies that specifically bind and
neutralize HIV. The monoclonal antibodies of the
invention find particular utility as reagents for the
diagnosis and immunotherapy of HIV-induced disease.

A major advantage of the monoclonal antibodies of the
invention derives from the fact that they are encoded by a
human polynucleotide sequence. Thus, in vivo use of the
monoclonal antibodies of the invention for diagnosis and
immunotherapy of HIV-induced disease greatly reduces the
problem of a significant host immune response to the
passively administered antibodies, a problem
commonly encountered when monoclonal antibodies of
xenogeneic or chimeric derivation are utilized.

An additional major advantage of a preferred group of
monoclonal antibodies described herein derives from the
fact that they immunoreact with a unique determinant
present on mature HIV glycoprotein gp120. This class of
antibodies is particularly effective at neutralizing field
isolates of HIV.

In one embodiment, the invention contemplates a human
monoclonal antibody capable of immunoreacting with human
immunodeficiency virus (HIV) glycoprotein gp120 and
neutralizing HIV. A preferred human monoclonal antibody
has the binding specificity of a monoclonal antibody
comprising a heavy chain immunoglobulin variable region
amino acid residue sequence selected from the group
consisting of SEQ ID NOs 66, 67, 68, 70, 72, 73, 74, 75,
78 and 97.

WO 96/02273 PCT/US95/08743

In a particularly preferred embodiment, the invention
describes a human monoclonal antibody capable of
immunoreacting with human immunodeficiency virus (HIV)
glycoprotein gp120 and neutralizing HIV, wherein the
monoclonal antibody has the capacity to reduce HIV
infectivity titer in an in vitro virus infectivity assay
by 50% at a concentration of less than 700 nanograms (ng)
of antibody per milliliter (ml).
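
The potency criterion above (a 50% reduction in infectivity titer at
less than 700 ng/ml) can be illustrated with a minimal sketch. The
function names, the log-linear interpolation between adjacent points
of a dilution series, and the example data are assumptions introduced
here for illustration only; they are not taken from the assay
procedures of the Examples.

```python
import math

THRESHOLD_NG_PER_ML = 700.0  # limit stated in the text above


def concentration_for_half_neutralization(points):
    """Estimate the antibody concentration giving 50% neutralization.

    points: iterable of (concentration_ng_per_ml, percent_neutralization).
    Interpolates linearly on log10(concentration) between the two
    adjacent points that bracket 50%. Returns None if 50% is never reached.
    """
    pts = sorted(points)
    # If the lowest tested concentration already gives >= 50%,
    # that concentration is an upper bound and is returned as-is.
    if pts and pts[0][1] >= 50.0:
        return float(pts[0][0])
    for (c_lo, n_lo), (c_hi, n_hi) in zip(pts, pts[1:]):
        if n_lo < 50.0 <= n_hi:
            frac = (50.0 - n_lo) / (n_hi - n_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


def meets_potency_criterion(points):
    """True if the estimated 50% concentration is below 700 ng/ml."""
    c50 = concentration_for_half_neutralization(points)
    return c50 is not None and c50 < THRESHOLD_NG_PER_ML


# Hypothetical dilution series (ng/ml, % neutralization), for illustration only.
example = [(10, 5.0), (100, 30.0), (1000, 70.0), (10000, 95.0)]
c50_example = concentration_for_half_neutralization(example)
```

With the hypothetical series shown, the 50% point falls midway (on a
log scale) between 100 and 1000 ng/ml, so the antibody would satisfy
the criterion.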

Preferably, an anti-gp120 monoclonal antibody of this
invention binds mature gp120 preferentially over HIV
precursor glycoprotein gp160. More preferably, an
anti-gp120 monoclonal antibody binds to a V1/V2
loop-deficient variant gp120 substantially less than to
native gp120, thereby defining an important epitope for
the antibody.
Human monoclonal antibodies having these properties are
particularly useful at neutralizing field isolates, and
therefore provide useful information regarding the
immunocompetence of an immune response in HIV-infected
patients.

Therefore, the invention provides for a screening
method to determine whether HIV-infected patients contain
antibodies of the class that neutralize field isolates.
The method for determining immunocompetence of a human
anti-human immunodeficiency virus (HIV) antibody in a
sample comprises the steps of:

(1) contacting a sample believed to contain a
human anti-HIV antibody with a diagnostically effective
amount of the above-described anti-gp120 monoclonal
antibody in a competition immunoreaction admixture
containing mature gp120 in the solid phase;

(2) maintaining the competition immunoreaction
admixture under conditions sufficient for the monoclonal
antibody to bind with the gp120 in the solid phase and
form a solid phase immunoreactant; and

(3) detecting the amount of the immunoreactant
present in the solid phase, and thereby the
immunocompetence of any human anti-HIV antibody in the
sample.
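
The readout of step (3) can be sketched as a simple calculation. In a
competition format, less monoclonal antibody bound to the solid phase
in the presence of the sample suggests that the sample contains
antibodies competing for the same determinant. The function name, the
background subtraction, and the example signal values below are
illustrative assumptions, not part of the described method.

```python
def percent_inhibition(signal_with_sample, signal_no_competitor, background=0.0):
    """Percent reduction in solid-phase monoclonal antibody binding.

    signal_with_sample:   signal measured when the test sample competes.
    signal_no_competitor: signal measured with the monoclonal antibody alone.
    background:           signal from wells lacking the monoclonal antibody.
    The result is clamped to the range 0..100.
    """
    uninhibited = signal_no_competitor - background
    if uninhibited <= 0:
        raise ValueError("uninhibited signal must exceed background")
    inhibited = signal_with_sample - background
    pct = 100.0 * (1.0 - inhibited / uninhibited)
    return max(0.0, min(100.0, pct))


# Hypothetical absorbance readings, for illustration only.
example_inhibition = percent_inhibition(0.35, 1.25, background=0.05)
```

A high percent inhibition would indicate that the sample contains
antibody of the competing class; any cutoff used to call a sample
positive would have to be established empirically.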

Another preferred human monoclonal antibody has the
binding specificity of a monoclonal antibody comprising a
light chain immunoglobulin variable region amino acid
residue sequence selected from the group consisting of SEQ
ID NOs 95, 96, 97, 98, 101, 102, 103, 104, 105, 107, 110,
115, 118, 121, 122, 124 and 132.

In a further embodiment, the invention contemplates a
human monoclonal antibody capable of immunoreacting with
human immunodeficiency virus (HIV) glycoprotein gp41 and
neutralizing HIV. A preferred human monoclonal antibody
has the binding specificity of a monoclonal antibody
comprising a heavy chain immunoglobulin variable region
amino acid residue sequence selected from the group
consisting of SEQ ID NOs 142, 143, 144, 145 and 146.

Another preferred human monoclonal antibody has the
binding specificity of a monoclonal antibody comprising a
light chain immunoglobulin variable region amino acid
residue sequence selected from the group consisting of SEQ
ID NOs 147, 148, 149, 150 and 151.

In another embodiment, the invention describes a
polynucleotide sequence encoding a heavy or light chain
immunoglobulin variable region amino acid residue sequence
portion of a human monoclonal antibody of this invention.
Also contemplated are DNA expression vectors containing
the polynucleotide, and host cells containing the vectors
and polynucleotides of the invention.

The invention also contemplates a method of detecting
human immunodeficiency virus (HIV) comprising contacting a
sample suspected of containing HIV with a diagnostically
effective amount of the monoclonal antibody of this
invention, and determining whether the monoclonal antibody
immunoreacts with the sample. The method can be practiced
in vitro or in vivo, and may include a variety of methods
for determining the presence of an immunoreaction product.

In another embodiment, the invention describes a method
for providing passive immunotherapy to human
immunodeficiency virus (HIV) disease in a human,
comprising administering to the human an
immunotherapeutically effective amount of the monoclonal
antibody of this invention. The administration can be
provided prophylactically, and by parenteral
administration. Pharmaceutical compositions containing
one or more of the different human monoclonal antibodies
are described for use in the therapeutic methods of the
invention.

Brief Description of the Drawings 

In the drawings forming a portion of this disclosure: 

Figure 1 illustrates the sequence of the
double-stranded synthetic DNA inserted into Lambda Zap to
produce a Lambda Hc2 expression vector. The preparation
of the double-stranded synthetic DNA insert is described
in Example 1a2). The various features required for this
vector to express the VH-coding DNA homologs include the
Shine-Dalgarno ribosome binding site, a leader sequence to
direct the expressed protein to the periplasm as described
by Mouva et al., J. Biol. Chem., 255:27, 1980, and various
restriction enzyme sites used to operatively link the VH
homologs to the expression vector. The VH expression
vector sequence also contains a short nucleic acid
sequence that codes for amino acids typically found in
heavy chain variable regions (VH backbone). This VH
backbone is just upstream of, and in the proper reading
frame with, the VH DNA homologs that are operatively
linked into the Xho I and Spe I cloning sites. The
sequences of the top and bottom strands of the
double-stranded synthetic DNA insert
are listed respectively in SEQ ID NO 1 and SEQ ID NO 2.
The ten amino acid sequence comprising the decapeptide tag
is listed in SEQ ID NO 5. The synthetic DNA insert is
directionally ligated into Lambda Zap II digested with the
restriction enzymes Not I and Xho I to form the Lambda Hc2
expression vector.

Figure 2 illustrates the major features of the
bacterial expression vector Lambda Hc2 (VH expression
vector). The orientation of the insert in Lambda Zap II
is shown. The VH DNA homologs are inserted into the Xho I
and Spe I cloning sites. The read-through transcription
produces the decapeptide epitope (tag) that is located
just 3' of the cloning site. The amino acid residue
sequences of the decapeptide tag and the Pel B leader
sequence/spacer are respectively listed in SEQ ID NO 5 and
SEQ ID NO 6.

Figure 3 illustrates the sequence of the
double-stranded synthetic DNA inserted into Lambda Zap to
produce a Lambda Lc2 expression vector. The various features
required for this vector to express the VL-coding DNA
homologs are described in Figure 1. The VL-coding DNA
homologs are operatively linked into the Lc2 sequence at
the Sac I and Xho I restriction sites. The sequences of
the top and bottom strands of the double-stranded
synthetic DNA insert are listed respectively in SEQ ID NO
3 and SEQ ID NO 4. The synthetic DNA insert is
directionally ligated into Lambda Zap II digested with the
restriction enzymes Sac I and Not I to form the Lambda Lc2
expression vector.

Figure 4 illustrates the major features of the
bacterial expression vector Lc2 (VL expression vector).
The synthetic DNA sequence from Figure 3 is shown at the
top along with the Lac Z promoter from Lambda Zap II. The
orientation of the insert in Lambda Zap II is shown. The
VL DNA homologs are inserted into the Sac I and Xho I
cloning sites. The amino acid residue sequence of the Pel
B leader sequence/spacer is listed in SEQ ID NO 7.

Figure 5 illustrates the dicistronic expression vector, 
pComb, in the form of a phagemid expression vector. 

Figure 6 illustrates the neutralization of HIV-1 by 
recombinant Fabs. The same supernate preparations were 
used in p24 and syncytia assays. The figures indicate 
neutralization titers. Refer to Example 3 for details of 
the assay procedures and discussion of the results. The 
ELISA titers and Fab concentrations were determined as
described in Example 2b. 

Figure 7 illustrates the relative affinities of Fab
fragments for gp120 (IIIB) as illustrated by inhibition
ELISA performed as described in Example 2b6). Fabs 27, 6,
29, 2 and 3 are all prototype members of the different
groups discussed in Example 9a. Loop 2 is an Fab fragment
selected from the same library as the other Fabs but which
recognizes the V3 loop. The data is plotted as the
percentage of maximum binding on the Y-axis against
increasing concentrations (10^-11 M to 10^-7 M) of soluble
gp120 on the X-axis.

Figure 8 illustrates the soluble CD4 competition with
Fab fragments for gp120 (IIIB). P4D10 and loop 2 are
controls. P4D10 is a mouse monoclonal antibody reacting
with the V3 loop of gp120 (IIIB). The data, discussed in
Example 2b6), is plotted as described in Figure 7.

Figure 9 illustrates the neutralization of HIV by
purified Fabs prepared as described in Example 3. The
results shown are derived from the syncytia assay using
the MN strain. The data is plotted as percent of
inhibition of binding on the Y-axis against increasing Fab
concentrations [0.1 to greater than 10
micrograms/milliliter (µg/ml)] on the X-axis.

Figures 10A and 10B illustrate the amino acid residue
sequences of variable heavy (VH) domains of Fabs binding to
gp120. Seven distinct groups have been identified as
described in Example 9a based on sequence homology.
Identity with the first sequence in a group is indicated
by dots. The Fab clone names are indicated in the left
hand column. The corresponding SEQ ID NOs are indicated
in the right hand column. The sequenced regions from
left to right are framework region 1 (FR1), complementarity
determining region 1 (CDR1), framework region 2 (FR2),
complementarity determining region 2 (CDR2), framework
region 3 (FR3), complementarity determining region 3 (CDR3),
and framework region 4 (FR4). The five amino-terminal
residue sequence beginning with LEQ arises from the VH1a
primer, while the five amino-terminal residue sequence
beginning with LEE arises from the VH3a primer. The b11
and b29 sequences are very similar to the b3 group and
could be argued to be intraclonal variants within that
group; they are placed in their own group because of
differences at the V-D and D-J interface.
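
The dot convention used in these alignment figures (a dot marks
identity with the first sequence of the group) can be expanded back
into full sequences with a short routine. This is a minimal sketch;
the function names and the ten-residue example strings are
hypothetical and are not sequences from the Sequence Listing.

```python
def expand_dot_alignment(reference, aligned):
    """Replace each '.' in `aligned` with the residue at the same
    position in `reference` (the first sequence of the group)."""
    if len(reference) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    return "".join(r if a == "." else a for r, a in zip(reference, aligned))


def percent_identity(aligned):
    """Percentage of positions marked identical ('.') to the reference."""
    if not aligned:
        raise ValueError("empty sequence")
    return 100.0 * aligned.count(".") / len(aligned)


# Hypothetical reference and dot-notation rows, for illustration only.
group_reference = "LEQSGAEVKK"
group_member = "..E......R"
full_member = expand_dot_alignment(group_reference, group_member)
```

Here the expanded member differs from the reference at two of ten
positions.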

Figures 11A and 11B illustrate the amino acid residue
sequences of variable light (VL) domains of Fabs binding to
gp120. Refer to Figures 10A and 10B for the description
of the figure and to Example 9b for analysis of the
sequences.

Figures 12A and 12B illustrate the amino acid residue
sequences of VL domains from Fabs binding to gp120 and
generated by shuffling the heavy chain from clone b12
against a library of light chains (H12-LCn Fabs) as
described in Example 10. Note that the new VL sequences
have designated clone numbers that do not relate to those
numbers from the original library. The unique sequences
are listed in the Sequence Listing from SEQ ID NO 114 to
122. The new VL domain sequences are compared to that of
the original clone b12 VL sequence.

Figures 13A and 13B illustrate the amino acid residue
sequences of VH domains from Fabs binding to gp120 and
generated by shuffling the light chain from clone b12
against a library of heavy chains (L12-HCn Fabs) as
described in Example 10. Note that the new VH sequences
have designated clone numbers that do not relate to those
numbers from the original library. The unique sequences
are listed in the Sequence Listing from SEQ ID NO 123 to
132. The new VH domain sequences are compared to that of
the original clone b12 VH sequence.

Figures 14A and 14B illustrate plasmid maps of the
heavy (pTAC01H) and light chain (pTC01)
replicon-compatible chain-shuffling vectors, respectively.
Both plasmids are very similar in the section containing
the promoter and the cloning site. Abbreviations: tacPO,
tac promoter/operon; 5 histidine amino acid residue tag,
(histidine)5-tail; f1IG, intergenic region of f1-phage;
stu, stuffer fragment ready for in-frame replacement by
light and heavy chain, respectively; cat, chloramphenicol
transferase gene; bla, β-lactamase gene; ori, origin of
replication. The map is drawn approximately to scale.

Figures 15A and 15B illustrate the nucleotide sequences
of the binary shuffling vectors. The construction and use
of the vectors is described in Example 11. In Figure 15A,
the double-stranded nucleotide sequence of the multiple
cloning site in light chain vector, pTC01, is shown. The
sequences of the top and bottom nucleotide base strands
are listed respectively in SEQ ID NO 8 and SEQ ID NO 9.
The amino acid residue sequence comprising the pelB leader
ending in the Sac I restriction site is listed in SEQ ID
NO 10. In Figure 15B, the nucleotide sequence of the
multiple cloning site in heavy chain vector, pTAC01H, is
shown. The sequences of the top and bottom nucleotide
base strands are listed respectively in SEQ ID NO 11 and
SEQ ID NO 12. The amino acid residue sequence comprising
the pelB leader ending in the Xho I restriction site is
listed as SEQ ID NO 13. The amino acid residue sequence
comprising the histidine tail is listed in SEQ ID NO 14.
Relevant restriction sites are underlined. The tac promoter
and ribosome binding site (rbs) are indicated by boxes.

Figure 16 illustrates the complete set of directed
crosses between heavy and light chains of all Fab
fragments isolated from the original library by panning
with gp160 (IIIB) (b1-b27), gp120 (IIIB) (B8-B35), gp120
(SF2) (s4-s8), and the loop peptide (p35) assayed by ELISA
against IIIB gp120 as described in Example 11. Heavy
chains are listed horizontally and light chains are listed
vertically. Clones are sorted according to the grouping
established in Example 9. Different groups are separated
by horizontal and vertical lines. A "-" at the
intersection of a particular heavy chain and light chain
signifies a clear negative (a signal of 3 times background
or less) for that particular cross, a "+" shows a clear
positive comparable to the original heavy and light chain
combination, and a "w" denotes an intermediate value in
the ELISA. The HCp35/LCp35 combination is negative
when gp120 (IIIB) is used, but positive when assayed with
gp120 (IIIB). Identical chains carry the same identifier
(either * # !, §, or Y).
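
The three-way scoring of each heavy/light chain cross described for
Figure 16 can be sketched as follows. Only the negative rule (a signal
of 3 times background or less) is stated in the text; the
`comparable_fraction` parameter that decides when a signal counts as
"comparable" to the original chain combination, the function name,
the output labels, and the example readings are all assumptions made
for illustration.

```python
def score_cross(signal, background, parental_signal, comparable_fraction=0.5):
    """Classify one heavy/light chain cross from its ELISA signal.

    Returns "-" for a clear negative (signal <= 3 x background),
    "+" when the signal reaches `comparable_fraction` of the signal of
    the original heavy/light combination (an assumed cutoff),
    and "w" for intermediate values.
    """
    if signal <= 3 * background:
        return "-"
    if signal >= comparable_fraction * parental_signal:
        return "+"
    return "w"


# Hypothetical absorbance readings, for illustration only.
example_scores = [
    score_cross(0.12, background=0.05, parental_signal=1.5),
    score_cross(0.40, background=0.05, parental_signal=1.5),
    score_cross(1.40, background=0.05, parental_signal=1.5),
]
```

Applying such a function to every heavy/light pair would yield a grid
of the kind the figure describes, with heavy chains along one axis and
light chains along the other.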

Figure 17 illustrates the affinity of antibody-antigen
interaction for b12 heavy chain crosses with light chains
from all pannings analyzed by competitive ELISA using
soluble IIIB gp120 as competing antigen as described in
Example 10. The data is plotted as the percentage of
maximum binding on the Y-axis against increasing
concentrations of soluble gp120 (IIIB) (10^-12 M to 10^-7 M)
on the X-axis.

Figures 18A and 18B illustrate the amino acid residue
sequences of variable heavy (VH) domains of Fabs binding to
gp41. The Fab clone names are indicated in the left hand
column. The heavy chain sequences of the five Fabs
individually designated DL 41 19, DO 41 11, GL 41 1, MT 41
12 and SS 41 8 have been assigned the respective SEQ ID
NOs 142, 143, 144, 145 and 146. The sequenced regions
from left to right are framework region 1 (FR1),
complementarity determining region 1 (CDR1), framework
region 2 (FR2), complementarity determining region 2 (CDR2),
framework region 3 (FR3), complementarity determining region
3 (CDR3), and framework region 4 (FR4).

Figures 19A and 19B illustrate the amino acid residue
sequences of variable light (VL) domains of Fabs binding to
gp41. Refer to Figures 18A and 18B for the description of
the figure. The light chain sequences of the five Fabs
individually designated DL 41 19, DO 41 11, GL 41 1, MT 41
12 and SS 41 8 have been assigned the respective SEQ ID
NOs 147, 148, 149, 150 and 151.

Figure 20 illustrates the relative binding affinities
of b3, b6, and b12 for the total envelope glycoproteins
(gp160) and for the gp120 glycoprotein (gp120) expressed
on the surface of COS-1 cells as determined by
immunoprecipitation and described in Example 6. The
signal on the autoradiogram represents the relative amount
of envelope glycoproteins bound with increasing
concentrations of Fab (0-150 µg/ml).

Figure 21 illustrates the neutralization of HIV-1 by
b12 IgG1 as assessed using PHA-stimulated PBMCs as
indicator cells and determination of extracellular p24 as
the reporter assay. Refer to Example 5d for details of
the assay procedures and discussion of the results. The
designation, location, and disease status of the virus
donors were as follows: ■, VS (New York, acute), ▼, N70-2
(New Orleans, asymptomatic), a, AC (San Diego, AIDS), •,
LS (Los Angeles, AIDS), 0, NYC-A (New York, unknown), v,
WM (Los Angeles, AIDS), a, RA (New York, acute), o, JP
(New York, acute). The molecularly cloned HIV-1 virus
JR-CSF (♦) and HIV-1 isolate JR-FL (o) were also assayed for
neutralization. The data is plotted as % neutralization
on the Y-axis against increasing concentrations of b12
IgG1 (0-25 µg/ml) on the X-axis.

Figure 22 illustrates the reactivity of b12 IgG1 with a
panel of international isolates of HIV-1 as described in
Example 8. Reactivity was determined with gp120 isolated
from the HIV-1 samples in ELISA with the b12 IgG1 as
described in Example 8. Data is plotted as % b12 IgG1
reactivity on the X-axis against clades A-F on the Y-axis.
Country names indicate where the HIV-1 virus was
originally isolated. The numbers in parentheses refer to
the number of viruses of each clade examined. Reactivity
is designated as strong ($) or moderate (::).

Figure 23 illustrates the neutralization of the HXBc2
molecular clone of HIV-1 LAI by purified Fabs and a
monoclonal antibody 110.4 (MAb 110.4) in an envelope
complementation assay as described in Example 3c.
Neutralization of HXBc2 infectivity is expressed as a
decrease in residual CAT activity. The data is plotted as
% residual CAT activity on the Y-axis and increasing
concentrations of Fab and MAb (0.1-20 µg/ml) on the
X-axis.

Figure 24 illustrates the pSG-5 mammalian expression
vector as described in Examples 4a and 4b. Transcription
of the heavy or light chain gene when inserted in the
EcoRI site is under the control of the SV40 early
promoter. Transcriptional termination is signaled by the
SV40 polyadenylation signal sequence downstream of the
heavy chain sequence. The M13 intergenic region allows
for the production of single-stranded DNA for nucleotide
sequence determination. The ampR gene is for selection of
the vector in bacterial cells.

Figures 25A and 25B illustrate the nucleotide and amino
acid residue sequences of the b12 light chain gene in the
pSG-5 mammalian expression vector described in Example 4b.
The b12 light chain has been modified for expression in
mammalian cells as described in Example 4b.

Figure 26 illustrates pEe6HC BM12, the pEE6 mammalian
expression vector with the b12 IgG1 heavy chain gene that
has been modified for antibody expression in mammalian
cells as described in Example 4d. The VH was originally
derived from the Fab b12 and has the same binding
specificity as the Fab b12. The pEE6 vector has a human
CMV promoter for expression of the heavy chain, a
polyadenylation signal for termination of transcription,
and an ampicillin gene for selection in bacteria.

Figures 27A through 27E illustrate the nucleotide
sequences of the b12 heavy chain VH and constant regions
in the pEe6HC BM12 mammalian expression vector as
described in Example 4d. The amino acid residue sequence of
the b12 heavy chain VH is given. The b12 VH has been
modified for expression in mammalian cells as described in
Example 4d.

Figure 28 illustrates pEe12 Combo BM12, the pEE12
mammalian expression vector with b12 IgG1 heavy and light
chain genes that have been modified for antibody
expression in mammalian cells as described in Example 4f.
The VH and light chain were originally derived from the
Fab b12 and have the same binding specificity as the Fab
b12. The pEE12 vector has a human CMV promoter for
expression of the light chain, a polylinker to provide
cloning sites, and a polyadenylation signal for
termination of transcription. The vector also contains
the GS selectable marker gene whose expression is
controlled by an SV40 early promoter at the 5' end of the GS
gene, an intron, and a polyadenylation signal at the 3'
end of the GS gene. A heavy chain cassette comprising the
HCMV promoter, enhancer elements, heavy chain gene, and
polyadenylation signal was removed from the pEE6 vector
and inserted into the pEE12 vector to generate the
combinatorial construct containing both the b12 light and
heavy chain genes.

Figures 29A through 29R illustrate the nucleotide
sequence of the pEE12 mammalian expression vector and the
b12 IgG1 heavy and light chain genes, pEe12 Combo BM12,
as described in Example 4f. The VH and light chain genes
have been modified for expression in mammalian cells as
described in Example 4.

Detailed Description of the Invention 
A. Definitions

Amino Acid Residue: An amino acid formed upon
chemical digestion (hydrolysis) of a polypeptide at its
peptide linkages. The amino acid residues described
herein are preferably in the "L" isomeric form. However,
residues in the "D" isomeric form can be substituted for
any L-amino acid residue, as long as the desired
functional property is retained by the polypeptide. NH2
refers to the free amino group present at the amino
terminus of a polypeptide. COOH refers to the free
carboxy group present at the
carboxy terminus of a polypeptide. In keeping with
standard polypeptide nomenclature (described in J. Biol.
Chem., 243:3552-59 (1969) and adopted at 37 CFR
§1.822(b)(2)), abbreviations for amino acid residues are
shown in the following Table of Correspondence:

TABLE OF CORRESPONDENCE

        SYMBOL
1-Letter    3-Letter    AMINO ACID

   Y          Tyr       tyrosine
   G          Gly       glycine
   F          Phe       phenylalanine
   M          Met       methionine
   A          Ala       alanine
   S          Ser       serine
   I          Ile       isoleucine
   L          Leu       leucine
   T          Thr       threonine
   V          Val       valine
   P          Pro       proline
   K          Lys       lysine
   H          His       histidine
   Q          Gln       glutamine
   E          Glu       glutamic acid
   Z          Glx       Glu and/or Gln
   W          Trp       tryptophan
   R          Arg       arginine
   D          Asp       aspartic acid
   N          Asn       asparagine
   B          Asx       Asn and/or Asp
   C          Cys       cysteine
   X          Xaa       Unknown or other
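
The correspondence between one-letter and three-letter symbols lends
itself to a simple lookup, sketched below for converting a one-letter
residue sequence into three-letter notation. The dictionary and
function names are illustrative choices, and the space separator is an
assumption made to avoid confusion with the dash convention used for
peptide bonds in this specification.

```python
# One-letter to three-letter symbols, following the Table of Correspondence.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "Z": "Glx", "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn",
    "B": "Asx", "C": "Cys", "X": "Xaa",
}


def to_three_letter(sequence):
    """Convert a one-letter residue string (amino to carboxy terminus)
    into space-separated three-letter symbols.

    Raises KeyError for any symbol not in the table.
    """
    return " ".join(ONE_TO_THREE[residue] for residue in sequence.upper())


# The amino-terminal tripeptide LEQ mentioned for Figures 10A and 10B.
example_three_letter = to_three_letter("LEQ")
```

The conversion preserves the left-to-right, amino-to-carboxy
orientation of the input sequence.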



It should be noted that all amino acid residue
sequences represented herein by formulae have a
left-to-right orientation in the conventional direction of
amino terminus to carboxy terminus. In addition, the
phrase "amino acid residue" is broadly defined to include
the amino acids listed in the Table of Correspondence and
modified and unusual amino acids, such as those listed in
37 CFR 1.822(b)(4), and incorporated herein by reference.
Furthermore, it should be noted that a dash at the
beginning or end of an amino acid residue sequence
indicates a peptide bond to a further sequence of one or
more amino acid residues or a covalent bond to an
amino-terminal group such as NH2 or acetyl or to a
carboxy-terminal group such as COOH.
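
For readers who handle such sequences computationally, the Table of Correspondence can be expressed as a simple lookup. The following Python sketch is illustrative only: the names ONE_TO_THREE and to_three_letter are hypothetical and not part of this disclosure, and the input is assumed to be written in one-letter symbols from amino terminus to carboxy terminus.

```python
# One-letter to three-letter symbols, transcribed from the Table of Correspondence.
ONE_TO_THREE = {
    "Y": "Tyr", "G": "Gly", "F": "Phe", "M": "Met", "A": "Ala",
    "S": "Ser", "I": "Ile", "L": "Leu", "T": "Thr", "V": "Val",
    "P": "Pro", "K": "Lys", "H": "His", "Q": "Gln", "E": "Glu",
    "Z": "Glx", "W": "Trp", "R": "Arg", "D": "Asp", "N": "Asn",
    "B": "Asx", "C": "Cys", "X": "Xaa",
}


def to_three_letter(sequence: str) -> str:
    """Convert a one-letter sequence (amino to carboxy terminus) to three-letter symbols.

    Symbols absent from the table are reported as Xaa ("unknown or other").
    """
    return "-".join(ONE_TO_THREE.get(residue.upper(), "Xaa") for residue in sequence)


print(to_three_letter("GYFM"))  # Gly-Tyr-Phe-Met
```

Ambiguity symbols (Z, B, X) are passed through as Glx, Asx and Xaa rather than resolved, which matches how the table defines them.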

Recombinant DNA (rDNA) molecule: A DNA molecule
produced by operatively linking two DNA segments. Thus, a
recombinant DNA molecule is a hybrid DNA molecule
comprising at least two nucleotide sequences not normally
found together in nature. rDNAs not having a common
biological origin, i.e., evolutionarily different, are
said to be "heterologous".

Vector: An rDNA molecule capable of autonomous
replication in a cell and to which a DNA segment, e.g.,
gene or polynucleotide, can be operatively linked so as to
bring about replication of the attached segment. Vectors
capable of directing the expression of genes encoding
one or more polypeptides are referred to herein as
"expression vectors". Particularly important vectors 
allow cloning of cDNA (complementary DNA) from mRNAs 
produced using reverse transcriptase. 

Receptor: A receptor is a molecule, such as a
protein, glycoprotein and the like, that can specifically
(non-randomly) bind to another molecule.

Antibody: The term antibody in its various
grammatical forms is used herein to refer to
immunoglobulin molecules and immunologically active
portions of immunoglobulin molecules, i.e., molecules that
contain an antibody combining site or paratope. Exemplary
antibody molecules are intact immunoglobulin molecules,
substantially intact immunoglobulin molecules and portions
of an immunoglobulin molecule, including those portions
known in the art as Fab, Fab', F(ab')2 and F(v).

Antibody Combining Site: An antibody combining
site is that structural portion of an antibody molecule
comprised of heavy and light chain variable and
hypervariable regions that specifically binds
(immunoreacts with) an antigen. The term immunoreact in
its various forms means specific binding between an
antigenic determinant-containing molecule and a molecule
containing an antibody combining site such as a whole
antibody molecule or a portion thereof.

Monoclonal Antibody: A monoclonal antibody in its
various grammatical forms refers to a population of
antibody molecules that contain only one species of
antibody combining site capable of immunoreacting with a
particular epitope. A monoclonal antibody thus typically
displays a single binding affinity for any epitope with
which it immunoreacts. A monoclonal antibody may
therefore contain an antibody molecule having a plurality
of antibody combining sites, each immunospecific for a
different epitope, e.g., a bispecific monoclonal antibody.
Although historically a monoclonal antibody was produced
by immortalization of a clonally pure immunoglobulin
secreting cell line, a monoclonally pure population of
antibody molecules can also be prepared by the methods of
the present invention.

Fusion Polypeptide: A polypeptide comprised of at
least two polypeptides and a linking sequence to
operatively link the two polypeptides into one continuous
polypeptide. The two polypeptides linked in a fusion
polypeptide are typically derived from two independent
sources, and therefore a fusion polypeptide comprises two
linked polypeptides not normally found linked in nature.

Upstream: In the direction opposite to the
direction of DNA transcription, and therefore going from
5' to 3' on the non-coding strand, or 3' to 5' on the
mRNA.

Downstream: Further along a DNA sequence in the
direction of sequence transcription or read out, that is
traveling in a 3'- to 5'-direction along the non-coding
strand of the DNA or 5'- to 3'-direction along the RNA
transcript.

Cistron: Sequence of nucleotides in a DNA
molecule coding for an amino acid residue sequence and
including upstream and downstream DNA expression control
elements.

Leader Polypeptide: A short length of amino acid
sequence at the amino end of a polypeptide, which carries
or directs the polypeptide through the inner membrane and
so ensures its eventual secretion into the periplasmic
space and perhaps beyond. The leader sequence peptide is
commonly removed before the polypeptide becomes active.

Reading Frame: Particular sequence of contiguous
nucleotide triplets (codons) employed in translation. The
reading frame depends on the location of the translation
initiation codon.
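
The dependence of the reading frame on the initiation codon can be illustrated with a short Python sketch. It is a simplified illustration, not a method of this disclosure: the function name codons_in_frame is hypothetical, and ATG is assumed as the initiation codon in the DNA coding sequence.

```python
def codons_in_frame(sequence: str, start_codon: str = "ATG") -> list[str]:
    """Split a nucleotide sequence into codons, reading from the first start codon.

    Returns an empty list when no start codon is present. Trailing
    nucleotides that do not complete a triplet are ignored.
    """
    seq = sequence.upper()
    start = seq.find(start_codon)  # position of the initiation codon sets the frame
    if start < 0:
        return []
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


print(codons_in_frame("ccATGGCTTAAg"))  # ['ATG', 'GCT', 'TAA']
```

Moving the initiation codon by one or two nucleotides changes every downstream triplet, which is why the frame is defined relative to that codon.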

B. Human Monoclonal Antibodies 

The present invention relates to human monoclonal
antibodies which are specific for, and neutralize,
human immunodeficiency virus (HIV). In a preferred
embodiment of the invention, human monoclonal antibodies
are disclosed which are capable of binding epitopic
polypeptide sequences in glycoprotein gp120 of HIV.
Further preferred are human monoclonal
antibodies capable of binding epitopic polypeptide
sequences in glycoprotein gp41 of HIV. Also disclosed is
an antibody having a specified amino acid sequence, which
sequence confers the ability to bind a specific epitope
and to neutralize HIV when the virus is bound by these
antibodies. A human monoclonal antibody with a claimed
specificity, and like human monoclonal antibodies with
like specificity, are useful in the diagnosis and
immunotherapy of HIV-induced disease.

The term "HIV-induced disease" means any disease
caused, directly or indirectly, by HIV. An example of an
HIV-induced disease is acquired immunodeficiency
syndrome (AIDS), and any of the numerous conditions
associated generally with AIDS which are caused by HIV
infection.

Thus, in one aspect, the present invention is
directed to human monoclonal antibodies which are reactive
with an HIV neutralization site and cell lines which
produce such antibodies. The isolation of cell lines
producing monoclonal antibodies of the invention is
described in great detail further herein, and can be
accomplished using the phagemid vector library methods
described herein, and using routine screening techniques
which permit determination of the elementary
immunoreaction and neutralization patterns of the
monoclonal antibody of interest. Thus, if a human
monoclonal antibody being tested binds and neutralizes HIV
in a manner similar to a human monoclonal antibody
produced by the cell lines of the invention, then the
tested antibody is considered equivalent to an antibody of
the invention.

It is also possible to determine, without undue
experimentation, if a human monoclonal antibody has the
same (i.e., equivalent) specificity as a human monoclonal
antibody of this invention by ascertaining whether the
former prevents the latter from binding to HIV. If the
human monoclonal antibody being tested competes with the
human monoclonal antibody of the invention, as shown by a
decrease in binding by the human monoclonal antibody of
the invention in standard competition assays for binding
to a solid phase antigen, for example to gp120, then it is
likely that the two monoclonal antibodies bind to the
same, or a closely related, epitope.
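
The "decrease in binding" observed in a competition assay is commonly summarized as a percent inhibition. The Python sketch below shows one way such a figure might be computed from assay signals; the function name, the optional background subtraction, and the example readings are assumptions made for illustration, and no inhibition threshold is implied because none is stated here.

```python
def percent_inhibition(signal_without_competitor: float,
                       signal_with_competitor: float,
                       background: float = 0.0) -> float:
    """Percent decrease in binding of the reference antibody caused by a competitor.

    Signals are in arbitrary assay units (for example, absorbance)
    measured against the same solid phase antigen.
    """
    uninhibited = signal_without_competitor - background
    if uninhibited <= 0:
        raise ValueError("uninhibited signal must exceed background")
    inhibited = signal_with_competitor - background
    return 100.0 * (1.0 - inhibited / uninhibited)


# Hypothetical readings: reference antibody alone, then with the tested antibody.
print(percent_inhibition(1.0, 0.25))  # 75.0
```

A value near zero suggests no competition, while a large value is consistent with the two antibodies binding the same or a closely related epitope.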

Still another way to determine whether a human
monoclonal antibody has the specificity of a human
monoclonal antibody of the invention is to pre-incubate
the human monoclonal antibody of the invention with HIV
with which it is normally reactive, and then add the human
monoclonal antibody being tested to determine if the human
monoclonal antibody being tested is inhibited in its
ability to bind HIV. If the human monoclonal antibody
being tested is inhibited then, in all likelihood, it has
the same, or functionally equivalent, epitopic specificity
as the monoclonal antibody of the invention. Screening of
human monoclonal antibodies of the invention can also be
carried out utilizing HIV neutralization assays and
determining whether the monoclonal antibody neutralizes
HIV.

The ability to neutralize HIV at one or more
stages of virus infection is a desirable quality of a
human monoclonal antibody of the present invention. Virus
neutralization can be measured by a variety of in vitro
and in vivo methodologies. Exemplary methods described
herein for determining the capacity for neutralization are
the in vitro assays that measure inhibition of HIV-induced
syncytia formation, plaque assays and assays that measure
the inhibition of output of core p24 antigen from a cell
infected with HIV.

As shown herein, the immunospecificity of a human
monoclonal antibody of this invention can be directed to
epitopes that are shared across serotypes and/or strains
of HIV, or can be specific for a single strain of HIV,
depending upon the epitope. Thus, a preferred human
monoclonal antibody can immunoreact with HIV-1, HIV-2, or
both, and can immunoreact with one or more of the HIV-1
strains IIIB, MN, RF, SF-2, Z2, Z6, CDC4, ELI and the like
strains. In addition, a preferred human monoclonal
antibody can immunoreact with and neutralize a majority of
field isolates of HIV, as described further herein.

The immunospecificity of an antibody, its
HIV-neutralizing capacity, and the attendant affinity the
antibody exhibits for the epitope, are defined by the
epitope with which the antibody immunoreacts. The epitope
specificity is defined at least in part by the amino acid
residue sequence of the variable region of the heavy chain
of the immunoglobulin of the antibody, and in part by the
light chain variable region amino acid residue sequence.
Preferred human monoclonal antibodies immunoreact with the
CD4 binding site of glycoprotein gp120.

Also disclosed is an antibody having a specified 
amino acid sequence, which sequence confers the ability to 
bind a specific unique neutralizing epitope and to 
neutralize HIV when the virus is bound by these 
antibodies.

A preferred human monoclonal antibody of this 
invention has the binding specificity of a monoclonal 
antibody comprising a heavy chain immunoglobulin variable 
region amino acid residue sequence selected from the group 
of sequences consisting of SEQ ID NOs 66, 67, 68, 70, 72, 
73, 74, 75, 78 and 97, and conservative substitutions 
thereof.

Another preferred human monoclonal antibody of
this invention has the binding specificity of a monoclonal
antibody having a light chain immunoglobulin variable
region amino acid residue sequence selected from the group
of sequences consisting of SEQ ID NOs 95, 96, 97, 98, 101,
102, 103, 104, 105, 107, 110, 115, 118, 121, 122, 124 and
132, and conservative substitutions thereof.

In a preferred embodiment, a monoclonal antibody
of this invention exhibits a potent capacity to neutralize
HIV. The capacity to neutralize HIV is expressed as a
concentration of antibody molecules required to reduce the
infectivity titer of a suspension of HIV when assayed in
a typical in vitro infectivity assay, such as is
described herein. A monoclonal antibody of this invention
has the capacity to reduce HIV infectivity titer in an in
vitro virus infectivity assay by 50% at a concentration of
less than 700 nanograms (ng) of antibody per milliliter
(ml) of culture medium in the assay, and preferably
reduces infectivity titers 50% at a concentration of less
than 300 ng/ml, and more preferably at concentrations less
than about 10 ng/ml.
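
As an illustration of how a 50% neutralization concentration might be derived from titration data and compared with the limits stated above, consider the following Python sketch. The log-linear interpolation is an assumed, commonly used approximation and not the assay procedure of the Examples; the function names and the data points are hypothetical.

```python
import math
from typing import Optional, Sequence


def interpolate_ic50(concentrations_ng_per_ml: Sequence[float],
                     percent_reductions: Sequence[float]) -> Optional[float]:
    """Estimate the concentration giving a 50% reduction in infectivity titer.

    Interpolates linearly in log10(concentration) between the two measured
    points that bracket 50%. Concentrations must be positive. Returns None
    when no adjacent pair of points brackets 50%.
    """
    points = sorted(zip(concentrations_ng_per_ml, percent_reductions))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= 50.0 <= r_hi and r_hi > r_lo:
            fraction = (50.0 - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    return None


def potency_tier(ic50_ng_per_ml: float) -> str:
    """Classify a 50% neutralization concentration against the stated limits."""
    if ic50_ng_per_ml <= 0:
        raise ValueError("concentration must be positive")
    if ic50_ng_per_ml < 10:
        return "more preferred (less than about 10 ng/ml)"
    if ic50_ng_per_ml < 300:
        return "preferred (less than 300 ng/ml)"
    if ic50_ng_per_ml < 700:
        return "within stated capacity (less than 700 ng/ml)"
    return "outside stated capacity"


# Hypothetical titration: antibody concentration (ng/ml) vs. percent reduction.
ic50 = interpolate_ic50([1, 10, 100, 1000], [5, 30, 70, 95])
print(round(ic50, 1), potency_tier(ic50))
```

Because "about 10 ng/ml" is not an exact limit, the lowest tier boundary in this sketch should be read as approximate.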

Exemplary and preferred monoclonal antibodies
described herein are effective at 3-700 ng/ml, and
therefore are particularly well suited for inhibiting HIV
in vitro and in vivo.

Particularly preferred human monoclonal antibodies
of this invention immunoreact with gp120 in its "mature"
form, which form is to be distinguished from antigenic
determinants present on the HIV envelope precursor
glycoprotein designated gp160. gp160 is processed during
virus biogenesis by cleavage into two polypeptides, gp41
and gp120. "Mature" gp120 refers to the processed protein
that is found in mature HIV virus particles, and can be
detected on the surface of HIV-infected cells.

Thus, a preferred antibody of this invention binds
mature gp120 preferentially over HIV precursor
glycoprotein gp160. By "binds preferentially" is meant
that the antibody immunoreacts with (binds) substantially
more mature gp120 than gp160 in an immunoreaction
admixture. "Substantially more" typically indicates that
greater than 50% of the total mass of
immunoprecipitated material is gp120, and preferably
indicates that greater than 75%, more preferably
90%, of the immunoprecipitated material is gp120.

Methods for determining immunoreaction of a
subject antibody with gp120 or gp160 are well known in the
art, and the invention need not be so limited. However,
preferred methods for determining the relative amounts of
envelope glycoprotein antigens are described in the
Examples, and include radio-immunoprecipitation (RIP) of
cell-surface labeled HIV-infected cells, followed by
molecular weight analysis of the labeled products by
polyacrylamide gel electrophoresis (PAGE).

A preferred human monoclonal antibody also has the
ability to immunoreact with native gp120 and comparatively
bind substantially less of a variant gp120 produced by
recombinant DNA methods in which the V1 and V2 loops have
been deleted. The variant gp120, also referred to as a V1/V2
loop deficient-variant gp120, is described in the
Examples, and is seen to bind substantially less of a
preferred antibody, b12, in comparison to native gp120.
The term "native gp120" refers to a mature gp120 protein
having a normal amino acid residue sequence instead of a
variant protein having selected amino acid residue
substitutions or deletions, such as the V1/V2 loop
deficient-variant in which the V1 and V2 loops were
deleted. This preferential binding with native gp120
compared to the V1/V2 loop deficient-variant identifies an
important epitope defined by a preferred antibody of this
invention. Antibodies having this binding epitope are
particularly effective at neutralizing a majority of field
isolates of HIV, as described herein.

The ability to bind "substantially less" V1/V2
loop deficient-variant gp120 than native gp120 can be
readily measured using various immunoreaction detection
methods, although the assay methods described in Example
5c are particularly preferred. In preferred embodiments,
substantially less binding to V1/V2 loop deficient-variant
gp120 compared to native gp120 is indicated when the
comparison is conducted as described in Example 5c, and
the native gp120 exhibits a ratio value deviating from the
mean of greater than 2.0 and the variant exhibits a ratio
value deviating from the mean of less than 0.5.

A particularly preferred human monoclonal antibody
of this invention also has the capacity to neutralize a
majority of field isolates as disclosed herein. As is
well understood, the field (i.e., clinically isolated)
strains of HIV are typically different to some degree
antigenically from laboratory strains. Therefore, it is
well understood that useful neutralizing antibodies must
immunoreact with, and be neutralizing against, field
isolates of HIV. Preferably, the useful antibody
neutralizes a large percentage of field isolates, thereby
increasing its effectiveness when new strains are
encountered.

The Examples demonstrate that the human monoclonal
antibody b12 has the ability to neutralize a majority of
the field isolates tested. By majority is meant that in a
representative and diverse collection of field isolates,
the antibody is capable of neutralizing at least 50% of
the strains, and preferably at least 75% of the strains
tested. In this context, "neutralizing" means an effect
of reducing the HIV infectivity titer in an in vitro virus
infectivity assay as described herein at the antibody
concentrations described.
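
The "majority" criterion can be expressed as a simple fraction over a panel of isolates. The Python sketch below is illustrative only: the isolate names and values are invented placeholders, the function names are hypothetical, and the default cutoff of 700 ng/ml is an assumption taken from the upper concentration limit recited elsewhere herein.

```python
from typing import Dict, Optional


def fraction_neutralized(ic50_by_isolate: Dict[str, Optional[float]],
                         cutoff_ng_per_ml: float = 700.0) -> float:
    """Fraction of isolates with a 50% neutralization concentration below the cutoff.

    A value of None means 50% neutralization was not reached in the assay.
    """
    if not ic50_by_isolate:
        raise ValueError("at least one isolate is required")
    hits = sum(1 for ic50 in ic50_by_isolate.values()
               if ic50 is not None and ic50 < cutoff_ng_per_ml)
    return hits / len(ic50_by_isolate)


def breadth_label(fraction: float) -> str:
    """Map a neutralized fraction onto the 50% and 75% levels described above."""
    if fraction >= 0.75:
        return "preferred majority (at least 75%)"
    if fraction >= 0.50:
        return "majority (at least 50%)"
    return "not a majority"


# Placeholder panel; names and values are invented for illustration.
panel = {"isolate_A": 120.0, "isolate_B": 650.0, "isolate_C": None, "isolate_D": 40.0}
share = fraction_neutralized(panel)
print(share, breadth_label(share))  # 0.75 preferred majority (at least 75%)
```

The result is only as meaningful as the panel: the text requires a representative and diverse collection of field isolates.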

Thus, the invention also contemplates a human
monoclonal antibody capable of immunoreacting with and
neutralizing a first preselected human immunodeficiency
virus (HIV), such as the laboratory isolate MN or IIIB,
that is further capable of immunoreacting with and
neutralizing one or more other (i.e., second) strains of
HIV, particularly field strains. In this embodiment,
supported by the teachings of the Examples, the antibody
has the capacity to reduce HIV infectivity titer in an in
vitro virus infectivity assay of the first HIV strain by
50% at a concentration of less than 700 nanograms
(ng) of antibody per milliliter (ml), and has the
capacity to reduce HIV infectivity titer of a second field
strain of HIV in the same in vitro virus infectivity assay
by 50% at a concentration of less than about 700 nanograms
(ng) of antibody per milliliter (ml). In more preferred
embodiments and depending upon the particular HIV strain,
the capacity to reduce second field strain infectivity
titers by 50% can be exhibited at lower antibody
concentrations, such as below 300 ng/ml.

A particularly preferred antibody is an antibody
having the binding specificity of the b12 monoclonal
antibody described herein. The amino acid residue
sequence of the heavy chain variable region of b12 is
shown in SEQ ID NO 66, and the light chain variable region
sequence of b12 is shown in SEQ ID NO 97. Still more
preferred are human antibodies having the binding
specificity of the immunoglobulin heavy and light chain
polypeptides produced by ATCC 69079.

Further preferred human monoclonal antibodies 
immunoreact with the CD4 binding site of glycoprotein 
gp41. A preferred human monoclonal antibody of this 
invention has the binding specificity of a monoclonal 
antibody comprising a heavy chain immunoglobulin variable 
region amino acid residue sequence selected from the group 
of sequences consisting of SEQ ID NOs 142, 143, 144, 145, 
and 146 and conservative substitutions thereof. 

Another preferred human monoclonal antibody of 
this invention has the gp41 binding specificity of a
monoclonal antibody having a light chain immunoglobulin 
variable region amino acid residue sequence selected from 
the group of sequences consisting of SEQ ID NOs 147, 148, 
149, 150, and 151 and conservative substitutions thereof. 

As shown by the present teachings and using the
combinatorial library shuffling and screening methods, one
can identify new heavy and light chain pairs (H:L) that
function as an HIV-neutralizing monoclonal antibody. In
particular, one can shuffle a known heavy chain, derived
from an HIV-neutralizing human monoclonal antibody, with a
library of light chains to identify new H:L pairs that
form a functional antibody according to the present
invention. Similarly, one can shuffle a known light
chain, derived from an HIV-neutralizing human monoclonal
antibody, with a library of heavy chains to identify new
H:L pairs that form a functional antibody according to the
present invention.

Particularly preferred human monoclonal antibodies
are those having the gp120 immunoreaction (binding)
specificity of a monoclonal antibody having heavy and
light chain immunoglobulin variable region amino acid
residue sequences in pairs (H:L) selected from the group
consisting of SEQ ID NOs 66:95, 67:96, 72:102, 66:97,
73:107, 74:103, 70:101, 68:98, 75:104, 72:105, 78:110,
66:118, 66:122, 66:121, 66:115, 97:124, 97:132 and 66:98,
and conservative substitutions thereof. The designation
of two SEQ ID NOs with a colon, e.g., 66:95, is to connote
an H:L pair formed by the heavy and light chain
amino acid residue sequences shown in SEQ ID
NO 66 and SEQ ID NO 95, respectively.
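
The colon notation for H:L pairs is easy to handle programmatically. The following Python sketch parses such designations into heavy and light chain SEQ ID NOs; the class and function names are hypothetical and are offered only as an illustration of the notation.

```python
from typing import NamedTuple


class ChainPair(NamedTuple):
    heavy_seq_id: int  # SEQ ID NO of the heavy chain variable region
    light_seq_id: int  # SEQ ID NO of the light chain variable region


def parse_pair(designation: str) -> ChainPair:
    """Parse an H:L designation such as "66:95" (heavy first, light second)."""
    heavy, light = designation.strip().split(":")
    return ChainPair(int(heavy), int(light))


pairs = [parse_pair(p) for p in "66:95, 67:96, 72:102".split(",")]
print(pairs[0])  # ChainPair(heavy_seq_id=66, light_seq_id=95)
```

Keeping the heavy chain first mirrors the H:L order used throughout this description.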

Further preferred human monoclonal antibodies are 
those having the gp41 immunoreaction (binding) specificity
of a monoclonal antibody having heavy and light chain
immunoglobulin variable region amino acid residue
sequences in pairs (H:L) selected from the group
consisting of SEQ ID NOs 142:147, 143:148, 144:149,
145:150, and 146:151, and conservative substitutions
thereof.

Particularly preferred are human monoclonal
antibodies having the binding specificity of the
monoclonal antibody produced by the E. coli microorganisms
deposited with the ATCC, as described further herein.

Particularly preferred are human monoclonal
antibodies having the binding specificity of the
monoclonal antibodies produced by the E. coli
microorganisms designated ATCC 69078, 69079 and 69080. By
"having the binding specificity" is meant equivalent
monoclonal antibodies which exhibit the same or similar
immunoreaction and neutralization properties, and which
compete for binding to an HIV antigen. Preferred are the
human monoclonal antibodies produced by ATCC 69078, 69079
and 69080.

The term "conservative variation" as used herein 
denotes the replacement of an amino acid residue by 
another, biologically similar residue. Examples of 
conservative variations include the substitution of one 
hydrophobic residue such as isoleucine, valine, leucine or 
methionine for another, or the substitution of one polar 
residue for another, such as the substitution of arginine 
for lysine, glutamic for aspartic acids, or glutamine for 
asparagine, and the like. The term "conservative 
variation" also includes the use of a substituted amino 
acid in place of an unsubstituted parent amino acid 
provided that antibodies having the substituted 
polypeptide also neutralize HIV. Analogously, another 
preferred embodiment of the invention relates to 
polynucleotides which encode the above noted heavy and/or 
light chain polypeptides and to polynucleotide sequences 
which are complementary to these polynucleotide sequences. 
Complementary polynucleotide sequences include those 
sequences which hybridize to the polynucleotide sequences 
of the invention under stringent hybridization conditions. 
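
The named examples of conservative variation given above can be encoded as residue groups. The Python sketch below covers only those named examples (it does not attempt the open-ended "and the like"), uses hypothetical names, and checks chemical similarity only; it cannot verify the functional requirement that the resulting antibody still neutralizes HIV.

```python
# Groups taken from the examples named in the text, in one-letter symbols.
CONSERVATIVE_GROUPS = [
    frozenset("IVLM"),  # hydrophobic: isoleucine, valine, leucine, methionine
    frozenset("RK"),    # arginine for lysine
    frozenset("ED"),    # glutamic for aspartic acid
    frozenset("QN"),    # glutamine for asparagine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if the replacement falls in the same named group as the original residue."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues: no substitution took place
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("I", "V"))  # True
print(is_conservative("R", "D"))  # False
```

A True result identifies a candidate only; retention of binding and neutralization would still have to be confirmed by assay.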

By using the human monoclonal antibodies of the
invention, it is now possible to produce anti-idiotypic
antibodies which can be used to screen human monoclonal
antibodies to identify whether the antibody has the same
binding specificity as a human monoclonal antibody of the
invention and also used for active immunization (Herlyn et
al., Science, 232:100 (1986)). Such anti-idiotypic
antibodies can be produced using well-known hybridoma
techniques (Kohler et al., Nature, 256:495 (1975)). An
anti-idiotypic antibody is an antibody which recognizes
unique determinants present on the human monoclonal
antibody produced by the cell line of interest. These
determinants are located in the hypervariable region of
the antibody. It is this region which binds to a given
epitope and, thus, is responsible for the specificity of
the antibody. An anti-idiotypic antibody can be prepared
by immunizing an animal with the monoclonal antibody of
interest. The immunized animal will recognize and respond
to the idiotypic determinants of the immunizing antibody
and produce an antibody to these idiotypic determinants.
By using the anti-idiotypic antibodies of the immunized
animal, which are specific for the human monoclonal
antibody of the invention produced by a cell line which
was used to immunize the second animal, it is now possible
to identify other clones with the same idiotype as the
antibody of the hybridoma used for immunization.
Idiotypic identity between human monoclonal antibodies of
two cell lines demonstrates that the two monoclonal
antibodies are the same with respect to their recognition
of the same epitopic determinant. Thus, by using
anti-idiotype antibodies, it is possible to identify other
hybridomas expressing monoclonal antibodies having the
same epitopic specificity.

It is also possible to use the anti-idiotype 
technology to produce monoclonal antibodies which mimic an 
epitope. For example, an anti-idiotypic monoclonal
antibody made to a first monoclonal antibody will have a
binding domain in the hypervariable region which is the
"image" of the epitope bound by the first monoclonal
antibody. Thus, the anti-idiotypic monoclonal antibody
can be used for immunization, since the anti-idiotype
monoclonal antibody binding domain effectively acts as an
antigen.

In one preferred embodiment, the invention 
contemplates a truncated immunoglobulin molecule 
comprising a Fab fragment derived from a human monoclonal
antibody of this invention. The Fab fragment, lacking Fc
receptor, is soluble, and affords therapeutic advantages
in serum half life, and diagnostic advantages in modes of
using the soluble Fab fragment. The preparation of a
soluble Fab fragment is generally known in the
immunological arts and can be accomplished by a variety of 
methods. A preferred method of producing a soluble Fab 
fragment is described herein. 

In another preferred embodiment, the invention 
contemplates an immunoglobulin molecule comprising a Fab 
fragment derived from a human monoclonal antibody of this 
invention and the fragment crystallizable (Fc) domain of a
human immunoglobulin molecule. The entire (i.e., 
complete) immunoglobulin (Ig) molecule comprising a Fab 
fragment with the Fc domain may afford therapeutic and 
diagnostic advantages, and can be any of the several Ig 
species depending upon the ultimate use, including IgG, 
IgA, IgD, IgE, IgM, and isotypes thereof. The 
immunoglobulin molecule would be capable of effector 
functions associated with the Fc domain when used in 
passive immunotherapy. These effector functions include 
antibody-dependent cellular cytotoxicity (ADCC) and
complement-dependent cellular cytotoxicity (CDCC) which
promote the death of the cell to which the immunoglobulin 
molecule is specifically bound. The effector functions 
may therefore be desirable in therapeutic applications. 
Diagnostic assays include the ability to detect the 
presence of the immunoglobulin molecule. These assays 
rely on the cross-linking of red cells or beads in 
agglutinations, the activation of complement in plaque 
assays, or the antigenic properties of the Fc region of 
the heavy chain as detected by secondary antibodies in 
ELISA or RIA procedures to detect the presence of the 
immunoglobulin molecule. Such diagnostic assays can only 
be performed with the entire immunoglobulin molecule. The 
isolation of the immunoglobulin molecule is also 
facilitated by the presence of the Fc domain in that 
commonly used methods of immunoglobulin purification are 
based upon interaction of reagents with the Fc domain. 
The preparation of a Fab fragment with the Fc domain is 
generally known in the immunological arts and can be 
accomplished by a variety of methods. A preferred method
of producing a Fab fragment with the Fc domain is
described herein. 

Particularly preferred is the immunoglobulin IgG1
human antibody described herein that is comprised of the
b12 antibody Fab fragment and human Fc domain derived from
an IgG1 subtype, designated b12 IgG1. The structure and
preparation of this preferred human monoclonal antibody are
described herein, and it is prepared using the recombinant
DNA expression vector pEE12. The complete nucleotide
sequence of the vector for expressing the complete heavy
and light chains in the form of b12 IgG1 is shown in
Figure 27 and also in SEQ ID NOs 156 and 170.
Accordingly, the amino acid residue and nucleotide
sequences, respectively, for a preferred complete heavy
chain are shown in SEQ ID NOs 155 and 154, respectively,
and for a preferred light chain are shown in SEQ ID NOs
153 and 152, respectively. The nucleotide sequences for
preferred heavy and light chains are also shown in SEQ ID
NOs 169 and 168, respectively.

C. Immunotherapeutic Methods and Compositions

The human monoclonal antibodies can also be
used immunotherapeutically for HIV disease. The term
"immunotherapeutically" or "immunotherapy" as used herein
in conjunction with the monoclonal antibodies of the
invention denotes both prophylactic as well as therapeutic
administration. Thus, the monoclonal antibodies can be
administered to high-risk patients in order to lessen the
likelihood and/or severity of HIV-induced disease,
administered to patients already evidencing active HIV
infection, or administered to patients at risk of HIV
infection.

1. Therapeutic Compositions

The present invention therefore
contemplates therapeutic compositions useful for
practicing the therapeutic methods described herein.
Therapeutic compositions of the present invention contain
a physiologically tolerable carrier together with at least
one species of human monoclonal antibody as described 
herein, dissolved or dispersed therein as an active 
ingredient. In a preferred embodiment, the therapeutic 
composition is not immunogenic when administered to a 
human patient for therapeutic purposes, unless that 
purpose is to induce an immune response, as described 
elsewhere herein. 

As used herein, the terms "pharmaceutically 
acceptable", "physiologically tolerable" and grammatical 
variations thereof, as they refer to compositions, 
carriers, diluents and reagents, are used interchangeably 
and represent that the materials are capable of 
administration to or upon a human without the production 
of undesirable physiological effects such as nausea, 
dizziness, gastric upset and the like. 

The preparation of a pharmacological composition 
that contains active ingredients dissolved or dispersed 
therein is well understood in the art. Typically such 
compositions are prepared as sterile injectables either as
liquid solutions or suspensions, aqueous or non-aqueous;
however, solid forms suitable for solution, or
suspension, in liquid prior to use can also be prepared.
The preparation can also be emulsified. 

The active ingredient can be mixed with excipients 
which are pharmaceutically acceptable and compatible with 
the active ingredient and in amounts suitable for use in 
the therapeutic methods described herein. Suitable 
excipients are, for example, water, saline, dextrose, 
glycerol, ethanol or the like and combinations thereof. 
In addition, if desired, the composition can contain minor 
amounts of auxiliary substances such as wetting or 
emulsifying agents, pH buffering agents and the like which 
enhance the effectiveness of the active ingredient.

The therapeutic composition of the present 
invention can include pharmaceutically acceptable salts of 
the components therein. Pharmaceutically acceptable salts 
include the acid addition salts (formed with the free
amino groups of the polypeptide) that are formed with
inorganic acids such as, for example, hydrochloric or
phosphoric acids, or such organic acids as acetic,
tartaric, mandelic and the like. Salts formed with the
free carboxyl groups can also be derived from inorganic
bases such as, for example, sodium, potassium, ammonium,
calcium or ferric hydroxides, and such organic bases as 
isopropylamine, trimethylamine, 2-ethylamino ethanol, 
histidine, procaine and the like. 

Physiologically tolerable carriers are well known
in the art. Exemplary of liquid carriers are sterile
aqueous solutions that contain no materials in addition to
the active ingredients and water, or contain a buffer such
as sodium phosphate at physiological pH value,
physiological saline or both, such as phosphate-buffered
saline. Still further, aqueous carriers can contain more
than one buffer salt, as well as salts such as sodium and
potassium chlorides, dextrose, propylene glycol,
polyethylene glycol and other solutes.

Liquid compositions can also contain liquid phases
in addition to and to the exclusion of water. Exemplary
of such additional liquid phases are glycerin, vegetable
oils such as cottonseed oil, organic esters such as ethyl
oleate, and water-oil emulsions.

A therapeutic composition contains an
HIV-neutralizing amount of a human monoclonal antibody of
the present invention, typically an amount of at least 0.1
weight percent of antibody per weight of total therapeutic
composition. A weight percent is a ratio by weight of
antibody to total composition. Thus, for example, 0.1
weight percent is 0.1 grams of antibody per 100 grams of
total composition.
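
The weight percent definition above is simple arithmetic, and
a minimal sketch may make it concrete. The function names and
the sample masses below are illustrative assumptions and do not
appear in the specification; only the 0.1 weight percent
figure comes from the text.

```python
def weight_percent(antibody_grams: float, total_grams: float) -> float:
    """Return antibody content as a weight percent of the total composition."""
    if total_grams <= 0:
        raise ValueError("total composition mass must be positive")
    if antibody_grams < 0 or antibody_grams > total_grams:
        raise ValueError("antibody mass must lie between 0 and the total mass")
    return 100.0 * antibody_grams / total_grams


def meets_typical_minimum(antibody_grams: float, total_grams: float,
                          minimum_percent: float = 0.1) -> bool:
    """Check a composition against the 'at least 0.1 weight percent' figure.

    The default of 0.1 is taken from the text; the function itself is a
    hypothetical helper.
    """
    return weight_percent(antibody_grams, total_grams) >= minimum_percent


if __name__ == "__main__":
    # 0.1 g of antibody in 100 g of composition, the example given in the text.
    print(weight_percent(0.1, 100.0))
    # Hypothetical formulation: 0.5 g antibody in 100 g of composition.
    print(meets_typical_minimum(0.5, 100.0))
```

The input checks reflect that a weight ratio of antibody to
total composition cannot be negative or exceed 100 percent.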

2. Therapeutic Methods

In view of the demonstrated HIV
neutralizing ability of the human monoclonal antibodies of
the present invention, the present disclosure provides for
a method for neutralizing HIV in vitro or in vivo. The
method comprises contacting a sample believed to contain
HIV with a composition comprising a therapeutically
effective amount of a human monoclonal antibody of this
invention.

For in vivo modalities, the method comprises 
administering to the patient a therapeutically effective 
amount of a physiologically tolerable composition 
containing a human monoclonal antibody of the invention. 
Thus, the present invention describes in one embodiment a 
method for providing passive immunotherapy to HIV disease 
in a human comprising administering to the human an 
immunotherapeutically effective amount of the monoclonal 
antibody of this invention. 

A representative patient for practicing the 
present passive immunotherapeutic methods is any human 
exhibiting symptoms of HIV-induced disease, including AIDS
or related conditions believed to be caused by HIV 
infection, and humans at risk of HIV infection. Patients 
at risk of infection by HIV include babies of HIV-infected 
pregnant mothers, recipients of transfusions known to 
contain HIV, users of HIV contaminated needles, 
individuals who have participated in high risk sexual 
activities with known HIV-infected individuals, and the 
like risk situations. 

In one embodiment, the passive immunization method 
comprises administering a composition comprising more than 
one species of human monoclonal antibody of this 
invention, preferably directed to non-competing epitopes 
or directed to distinct serotypes or strains of HIV, so as
to afford increased effectiveness of the passive
immunotherapy.

A therapeutically (immunotherapeutically) 
effective amount of a human monoclonal antibody is a 
predetermined amount calculated to achieve the desired 
effect, i.e., to neutralize the HIV present in the sample 
or in the patient, and thereby decrease the amount of 
detectable HIV in the sample or patient. In the case of
in vivo therapies, an effective amount can be measured by
improvements in one or more symptoms associated with
HIV-induced disease occurring in the patient, or by
serological decreases in HIV antigens.

Thus, the dosage ranges for the administration of
the monoclonal antibodies of the invention are those large
enough to produce the desired effect in which the symptoms
of the HIV disease are ameliorated or the likelihood of
infection decreased. The dosage should not be so large as
to cause adverse side effects, such as hyperviscosity
syndromes, pulmonary edema, congestive heart failure, and
the like. Generally, the dosage will vary with the age,
condition, sex and extent of the disease in the patient
and can be determined by one of skill in the art.

The dosage can be adjusted by the individual
physician in the event of any complication.

A therapeutically effective amount of an antibody
of this invention is typically an amount of antibody that,
when administered in a physiologically tolerable
composition, is sufficient to achieve a plasma
concentration of from about 0.1 microgram (ug) per
milliliter (ml) to about 100 ug/ml, preferably from about
1 ug/ml to about 5 ug/ml, and usually about 5 ug/ml.
Stated differently, the dosage can vary from about 0.1
mg/kg to about 300 mg/kg, preferably from about 0.2 mg/kg
to about 200 mg/kg, most preferably from about 0.5 mg/kg
to about 20 mg/kg, in one or more dose administrations
daily, for one or several days.
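
The three nested mg/kg ranges stated above can be tabulated
and applied to a body weight as follows. This is only an
arithmetic sketch: the class and function names and the 70 kg
body weight are hypothetical, and only the range endpoints
are taken from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class DoseRange:
    label: str
    low_mg_per_kg: float
    high_mg_per_kg: float

    def total_mg(self, body_weight_kg: float) -> Tuple[float, float]:
        """Scale the per-kilogram range to a total dose range in milligrams."""
        if body_weight_kg <= 0:
            raise ValueError("body weight must be positive")
        return (self.low_mg_per_kg * body_weight_kg,
                self.high_mg_per_kg * body_weight_kg)


# Range endpoints as stated in the text, ordered from broadest to narrowest.
RANGES = (
    DoseRange("broad", 0.1, 300.0),
    DoseRange("preferred", 0.2, 200.0),
    DoseRange("most preferred", 0.5, 20.0),
)


def narrowest_range(dose_mg_per_kg: float) -> Optional[str]:
    """Return the label of the narrowest stated range containing the dose."""
    for dose_range in reversed(RANGES):
        if dose_range.low_mg_per_kg <= dose_mg_per_kg <= dose_range.high_mg_per_kg:
            return dose_range.label
    return None


if __name__ == "__main__":
    # Hypothetical 70 kg subject.
    for dose_range in RANGES:
        print(dose_range.label, dose_range.total_mg(70.0))
    print(narrowest_range(5.0))
```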

The human monoclonal antibodies of the invention
can be administered parenterally by injection or by
gradual infusion over time. Although the HIV infection is
typically systemic and therefore most often treated by
intravenous administration of therapeutic compositions,
other tissues and delivery means are contemplated where
there is a likelihood that the tissue targeted contains
infectious HIV. Thus, human monoclonal antibodies of the
invention can be administered intravenously,
intraperitoneally, intramuscularly, subcutaneously,
intracavity, transdermally, and can be delivered by

The therapeutic compositions containing a human
monoclonal antibody of this invention are conventionally
administered intravenously, as by injection of a unit
dose, for example. The term "unit dose" when used in
reference to a therapeutic composition of the present
invention refers to physically discrete units suitable as
unitary dosage for the subject, each unit containing a
predetermined quantity of active material calculated to
produce the desired therapeutic effect in association with
the required diluent; i.e., carrier, or vehicle.

The compositions are administered in a manner 
compatible with the dosage formulation, and in a 
therapeutically effective amount. The quantity to be 
administered depends on the subject to be treated, 
capacity of the subject's system to utilize the active 
ingredient, and degree of therapeutic effect desired. 
Precise amounts of active ingredient required to be 
administered depend on the judgement of the practitioner 
and are peculiar to each individual. However, suitable 
dosage ranges for systemic application are disclosed 
herein and depend on the route of administration. 
Suitable regimes for administration are also variable, but 
are typified by an initial administration followed by 
repeated doses at one or more hour intervals by a 
subsequent injection or other administration. 
Alternatively, continuous intravenous infusion sufficient
to maintain concentrations in the blood in the ranges
specified for in vivo therapies is contemplated.

As an aid to the administration of effective
amounts of a monoclonal antibody, a diagnostic method for 
detecting a monoclonal antibody in the subject's blood is 
useful to characterize the fate of the administered 
therapeutic composition. 

The invention also relates to a method for 
preparing a medicament or pharmaceutical composition 
comprising the human monoclonal antibodies of the 
invention, the medicament being used for immunotherapy of
HIV disease.

D. Diagnostic Assay Methods 

The present invention contemplates various
assay methods for determining the presence, and preferably
amount, of HIV in a sample such as a biological fluid or
tissue sample using a human monoclonal antibody of this
invention as an immunochemical reagent to form an
immunoreaction product whose amount relates, either
directly or indirectly, to the amount of HIV in the
sample.

In a related embodiment, the present invention
contemplates various assay methods for determining the
presence, and preferably amount, of an anti-HIV antibody
present in a sample such as a biological fluid or tissue
sample from an HIV-infected individual using a human
monoclonal antibody of this invention as an immunochemical
reagent to form an immunoreaction product whose amount
relates, either directly or indirectly, to the amount of
anti-HIV antibody in the sample.

Those skilled in the art will understand that
there are numerous well known clinical diagnostic
chemistry procedures in which an immunochemical reagent of
this invention can be used to form an immunoreaction
product whose amount relates to the amount of HIV or
anti-HIV antibody present in a body sample. Thus, while
exemplary assay methods are described herein, the
invention is not so limited.

Various heterogeneous and homogeneous protocols,
either competitive or noncompetitive, can be employed in
performing an assay method of this invention. Examples of
types of immunoassays which can utilize monoclonal
antibodies of the invention are competitive and
non-competitive immunoassays in either a direct or indirect
format. Examples of such immunoassays are the
radioimmunoassay (RIA) and the sandwich (immunometric)
assay.

Detection of the antigens using the monoclonal
antibodies of the invention can be done utilizing
immunoassays which are run in either the forward, reverse,
or simultaneous modes, including immunohistochemical assays
on physiological samples. Those of skill in the art will
know, or can readily discern, other immunoassay formats
without undue experimentation.

The monoclonal antibodies of the invention can be 
bound to many different carriers and used to detect the 
presence of HIV. Examples of well-known carriers include 
glass, polystyrene, polypropylene, polyethylene, dextran, 
nylon, amylases, natural and modified celluloses, 
polyacrylamides, agaroses and magnetite. The nature of 
the carrier can be either soluble or insoluble for 
purposes of the invention. Those skilled in the art will 
know of other suitable carriers for binding monoclonal 
antibodies, or will be able to ascertain such, using 
routine experimentation. 

There are many different labels and methods of 
labeling known to those of ordinary skill in the art. 
Examples of the types of labels which can be used in the 
present invention include enzymes, radioisotopes, 
fluorescent compounds, colloidal metals, chemiluminescent 
compounds, and bio-luminescent compounds. Those of 
ordinary skill in the art will know of other suitable 
labels for binding to the monoclonal antibodies of the 
invention, or will be able to ascertain such, using 
routine experimentation. Furthermore, the binding of 
these labels to the monoclonal antibodies of the invention 
can be done using standard techniques common to those of 
ordinary skill in the art. 

For purposes of the invention, HIV may be detected 
by the monoclonal antibodies of the invention when present 
in samples of biological fluids and tissues. Any sample 
containing a detectable amount of HIV can be used. A 
sample can be a liquid such as urine, saliva, 
cerebrospinal fluid, blood, serum and the like, or a solid 
or semi-solid such as tissues, feces, and the like, or, 
alternatively, a solid tissue such as those commonly used 
in histological diagnosis.

Another labeling technique which may result in
greater sensitivity consists of coupling the antibodies to
low molecular weight haptens. These haptens can then be
specifically detected by means of a second reaction. For
example, it is common to use haptens such as biotin, which
reacts with avidin, or dinitrophenol, pyridoxal, or
fluorescein, which can react with specific anti-hapten
antibodies.

The monoclonal antibodies of the invention are
suited for use in vitro, for example, in immunoassays in
which they can be utilized in liquid phase or bound to a
solid phase carrier for the detection of HIV in samples,
as described above. The monoclonal antibodies in these
immunoassays can be detectably labeled in various ways for
in vitro use.

In using the human monoclonal antibodies of the
invention for the in vivo detection of antigen, the
detectably labeled human monoclonal antibody is given in a
dose which is diagnostically effective. The term
"diagnostically effective" means that the amount of
detectably labeled human monoclonal antibody is
administered in sufficient quantity to enable detection of
the site having the HIV antigen for which the monoclonal
antibodies are specific.

The concentration of detectably labeled human
monoclonal antibody which is administered should be
sufficient such that the binding to HIV is detectable
compared to the background. Further, it is desirable that
the detectably labeled monoclonal antibody be rapidly
cleared from the circulatory system in order to give the
best target-to-background signal ratio.

As a rule, the dosage of detectably labeled human
monoclonal antibody for in vivo diagnosis will vary
depending on such factors as age, sex, and extent of
disease of the individual. The dosage of human monoclonal
antibody can vary from about 0.01 mg/m² to about 500 mg/m²,
preferably 0.1 mg/m² to about 200 mg/m², most preferably
about 0.1 mg/m² to about 10 mg/m². Such dosages may vary,
for example, depending on whether multiple injections are
given, tissue, and other factors known to those of skill
in the art.
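
Because these diagnostic dosages are stated per square meter
of body surface area, converting them to a total dose needs a
surface-area estimate. The sketch below assumes the Mosteller
formula, which is not mentioned in the specification, and
uses hypothetical height and weight values; only the mg/m²
endpoints come from the text.

```python
import math
from typing import Tuple


def mosteller_bsa_m2(height_cm: float, weight_kg: float) -> float:
    """Estimate body surface area (m^2) with the Mosteller formula.

    BSA = sqrt(height_cm * weight_kg / 3600). Using this particular
    formula is an assumption made for illustration.
    """
    if height_cm <= 0 or weight_kg <= 0:
        raise ValueError("height and weight must be positive")
    return math.sqrt(height_cm * weight_kg / 3600.0)


def dose_range_mg(low_mg_per_m2: float, high_mg_per_m2: float,
                  bsa_m2: float) -> Tuple[float, float]:
    """Scale a mg/m^2 dosage range to total milligrams for a given BSA."""
    return (low_mg_per_m2 * bsa_m2, high_mg_per_m2 * bsa_m2)


if __name__ == "__main__":
    # Hypothetical individual: 180 cm tall, 80 kg.
    bsa = mosteller_bsa_m2(180.0, 80.0)
    # "Most preferably about 0.1 mg/m^2 to about 10 mg/m^2" from the text.
    print(bsa, dose_range_mg(0.1, 10.0, bsa))
```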

For in vivo diagnostic imaging, the type of 
detection instrument available is a major factor in 
selecting a given radioisotope. The radioisotope chosen 
must have a type of decay which is detectable for a given 
type of instrument. Still another important factor in 
selecting a radioisotope for in vivo diagnosis is that the 
half-life of the radioisotope be long enough so that it is 
still detectable at the time of maximum uptake by the 
target, but short enough so that deleterious radiation 
with respect to the host is minimized. Ideally, a 
radioisotope used for in vivo imaging will lack a particle 
emission, but produce a large number of photons in the 
140-250 keV range, which may be readily detected by 
conventional gamma cameras. 
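
The half-life trade-off described above follows from the
standard exponential decay law, under which the fraction of
activity remaining after time t is 0.5 raised to the power
t divided by the half-life. The sketch below is illustrative
only: the function names are hypothetical, and the half-life
of roughly 67 hours used for 111In is an approximate input
supplied here, not a value from the specification.

```python
import math


def fraction_remaining(elapsed_hours: float, half_life_hours: float) -> float:
    """Fraction of the initial activity left after elapsed_hours."""
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    if elapsed_hours < 0:
        raise ValueError("elapsed time cannot be negative")
    return 0.5 ** (elapsed_hours / half_life_hours)


def hours_until_fraction(target_fraction: float, half_life_hours: float) -> float:
    """Time for the activity to decay to target_fraction of its initial value."""
    if not 0.0 < target_fraction <= 1.0:
        raise ValueError("target fraction must be in (0, 1]")
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    return half_life_hours * math.log2(1.0 / target_fraction)


if __name__ == "__main__":
    approx_half_life_in111 = 67.3  # hours; approximate, assumed for illustration
    # Activity left if maximum target uptake occurs 48 hours after injection
    # (the 48 hour figure is hypothetical).
    print(fraction_remaining(48.0, approx_half_life_in111))
    # Hours until only 10 percent of the activity remains.
    print(hours_until_fraction(0.10, approx_half_life_in111))
```

A short half-life lowers the first number (less signal at
the time of maximum uptake), while a long half-life raises
the second (longer radiation exposure of the host).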

For in vivo diagnosis radioisotopes may be bound
to immunoglobulin either directly or indirectly by using
an intermediate functional group. Intermediate functional
groups which often are used to bind radioisotopes which
exist as metallic ions to immunoglobulins are the
bifunctional chelating agents such as
diethylenetriaminepentaacetic acid (DTPA) and
ethylenediaminetetraacetic acid (EDTA) and similar
molecules. Typical examples of metallic ions which can be
bound to the monoclonal antibodies of the invention are
¹¹¹In, ⁹⁷Ru, ⁶⁷Ga, ⁶⁸Ga, ⁷²As, ⁸⁹Zr, and ²⁰¹Tl.

The monoclonal antibodies of the invention can
also be labeled with a paramagnetic isotope for purposes
of in vivo diagnosis, as in magnetic resonance imaging
(MRI) or electron spin resonance (ESR). In general, any
conventional method for visualizing diagnostic imaging can
be utilized. Usually gamma and positron emitting
radioisotopes are used for camera imaging and paramagnetic
isotopes for MRI. Elements which are particularly useful
in such techniques include ¹⁵⁷Gd, ⁵⁵Mn, ¹⁶²Dy, ⁵²Cr, and ⁵⁶Fe.

The human monoclonal antibodies of the invention
can be used in vitro and in vivo to monitor the course of
HIV disease therapy. Thus, for example, by measuring the
increase or decrease in the number of cells infected with
HIV or changes in the concentration of HIV present in the
body or in various body fluids, it would be possible to
determine whether a particular therapeutic regimen aimed
at ameliorating the HIV disease is effective.

In a related diagnostic embodiment, the invention
contemplates screening HIV-infected patients for the
presence of circulating anti-HIV antibodies immunoreactive
with gp120 that have a similar epitope immunospecificity
when compared to a neutralizing antibody of this
invention. Such a screening method indicates that the
HIV-infected patient is exhibiting a significant immune
response to the virus, and provides useful information
regarding disease status and prognosis. The presence of
anti-HIV antibodies cross-reactive with a neutralizing
antibody of this invention indicates that the patient has
some degree of HIV neutralizing activity, as defined
herein.

The diagnostic assay involves determining whether
the patient contains human anti-HIV antibodies
immunoreactive with the same, similar or overlapping
epitopes as a neutralizing antibody of the invention, such
that there is a likelihood that there is a useful
neutralizing immune response in the patient. There are a
variety of immunological assay formats that can be
utilized to determine cross-reactivity of test and control
antibodies, and the invention need not be so limiting.
Particularly preferred are competition assays for a common
antigen, preferably in the solid phase.

A preferred embodiment of the competition
immunoassay method comprises the steps of:

(1) contacting a sample believed to contain
a human anti-HIV antibody with a diagnostically effective
amount of the monoclonal antibody described herein that
binds mature gp120 in a competition immunoreaction
admixture containing mature gp120 in the solid phase;

(2) maintaining said competition
immunoreaction admixture under conditions sufficient for
said monoclonal antibody to bind with said gp120 in the
solid phase and form a solid phase immunoreactant; and

(3) detecting the amount of said
immunoreactant present in said solid phase, and thereby
the immunocompetence of any human anti-HIV antibody in
said sample.

A diagnostically effective amount, in this
context, is an amount relative to the solid phase gp120,
preferably "mature" gp120 as defined herein, sufficient to
produce a detectable solid phase immunoreaction product
between the solid phase gp120 and the control anti-gp120
antibody of this invention. Exemplary competition assays
are described herein using the preferred b12 antibody.

Conditions for conducting the competition 
immunoreaction are well known in the art and can be varied 
according to recognized parameters in the contacting, the 
reaction admixtures, the maintenance step, the 
immunoreaction conditions and the detecting step. For 
example, the detection step can be conducted by use of a 
labeled antibody of this invention, by use of a second, 
labeled anti-human antibody, and the like, as described 
herein. 
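
One common way to summarize a competition assay of this kind
is as percent inhibition of the control antibody's solid
phase signal by the sample. The sketch below assumes that
readout; the function name, the background-subtraction step
and the sample signal values are hypothetical and are not
taken from the specification.

```python
def percent_inhibition(signal_with_sample: float,
                       signal_without_sample: float,
                       background: float = 0.0) -> float:
    """Percent reduction in control-antibody binding caused by a sample.

    signal_without_sample is the solid phase signal from the labeled
    control antibody alone; signal_with_sample is the signal when the
    test sample competes for the same solid phase antigen. The result
    is clamped to the range 0 to 100.
    """
    control = signal_without_sample - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    competed = signal_with_sample - background
    inhibition = 100.0 * (1.0 - competed / control)
    return max(0.0, min(100.0, inhibition))


if __name__ == "__main__":
    # Hypothetical absorbance readings.
    print(percent_inhibition(signal_with_sample=0.45,
                             signal_without_sample=1.20,
                             background=0.05))
```

A high value suggests that the sample contains antibodies
competing for the same or an overlapping epitope; a value
near zero suggests that it does not.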

E. Diagnostic Systems 

The present invention also describes a 
diagnostic system, preferably in kit form, for assaying 
for the presence of HIV or an anti-HIV antibody in a 
sample according to the diagnostic methods described 
herein. A diagnostic system includes, in an amount 
sufficient to perform at least one assay, a subject human 
monoclonal antibody, as a separately packaged reagent. 

In another embodiment, a diagnostic system is 
contemplated for assaying for the presence of an anti-HIV 
monoclonal antibody in a body fluid sample such as for
monitoring the fate of therapeutically administered
antibody. The system includes, in an amount sufficient
for at least one assay, a subject antibody as a control
reagent, and preferably a preselected amount of HIV
antigen, each as separately packaged immunochemical
reagents.

Instructions for use of the packaged reagent are
also typically included.

"Instructions for use" typically include a
tangible expression describing the reagent concentration
or at least one assay method parameter such as the
relative amounts of reagent and sample to be admixed,
maintenance time periods for reagent/sample admixtures,
temperature, buffer conditions and the like.

In embodiments for detecting HIV or anti-HIV
antibody in a body fluid, a diagnostic system of the
present invention can include a label or indicating means
capable of signaling the formation of an immunocomplex
containing a human monoclonal antibody of the present
invention.

The word "complex" as used herein refers to the 
product of a specific binding reaction such as an 
antibody-antigen reaction. Exemplary complexes are
immunoreaction products. 

As used herein, the terms "label" and "indicating
means" in their various grammatical forms refer to single
atoms and molecules that are either directly or indirectly
involved in the production of a detectable signal to
indicate the presence of a complex. Any label or
indicating means can be linked to or incorporated in an
expressed protein, polypeptide, or antibody molecule that
is part of an antibody or monoclonal antibody composition
of the present invention, or used separately, and those
atoms or molecules can be used alone or in conjunction
with additional reagents. Such labels are themselves
well-known in clinical diagnostic chemistry and constitute
a part of this invention only insofar as they are utilized
with otherwise novel proteins, methods and/or systems.

The labeling means can be a fluorescent labeling
agent that chemically binds to antibodies or antigens
without denaturing them to form a fluorochrome (dye) that
is a useful immunofluorescent tracer. Suitable
fluorescent labeling agents are fluorochromes such as
fluorescein isocyanate (FIC), fluorescein isothiocyanate
(FITC), 5-dimethylamine-1-naphthalenesulfonyl chloride
(DANSC), tetramethylrhodamine isothiocyanate (TRITC),
lissamine, rhodamine B200 sulphonyl chloride (RB 200 SC)
and the like. A description of immunofluorescence
analysis techniques is found in DeLuca,
"Immunofluorescence Analysis", in Antibody As a Tool,
Marchalonis et al., eds., John Wiley & Sons, Ltd., pp.
189-231 (1982), which is incorporated herein by reference.

In preferred embodiments, the indicating group is
an enzyme, such as horseradish peroxidase (HRP), glucose
oxidase, or the like. In such cases where the principal
indicating group is an enzyme such as HRP or glucose
oxidase, additional reagents are required to visualize the
fact that a receptor-ligand complex (immunoreactant) has
formed. Such additional reagents for HRP include hydrogen
peroxide and an oxidation dye precursor such as
diaminobenzidine. An additional reagent useful with
glucose oxidase is 2,2'-azino-di-(3-ethyl-benzthiazoline-
6-sulfonic acid) (ABTS).

Radioactive elements are also useful labeling
agents and are used illustratively herein. An exemplary
radiolabeling agent is a radioactive element that produces
gamma ray emissions. Elements which themselves emit gamma
rays, such as ¹²⁴I, ¹²⁵I, ¹²⁸I, ¹³²I and ⁵¹Cr represent one
class of gamma ray emission-producing radioactive element
indicating groups. Particularly preferred is ¹²⁵I.
Another group of useful labeling means are those elements
such as ¹¹C, ¹⁸F, ¹⁵O and ¹³N which themselves emit
positrons. The positrons so emitted produce gamma rays
upon encounters with electrons present in the animal's
body. Also useful is a beta emitter, such as ¹¹¹indium or
³H.

The linking of labels, i.e., labeling of,
polypeptides and proteins is well known in the art. For
instance, antibody molecules produced by a hybridoma can
be labeled by metabolic incorporation of radioisotope-
containing amino acids provided as a component in the
culture medium. See, for example, Galfre et al., Meth.
Enzymol., 73:3-46 (1981). The techniques of protein
conjugation or coupling through activated functional
groups are particularly applicable. See, for example,
Aurameas et al., Scand. J. Immunol., Vol. 8 Suppl. 7:7-23
(1978), Rodwell et al., Biotech., 3:889-894 (1984), and
U.S. Pat. No. 4,493,795.

The diagnostic systems can also include, 
preferably as a separate package, a specific binding 
agent. A "specific binding agent" is a molecular entity
capable of selectively binding a reagent species of the 
present invention or a complex containing such a species, 
but is not itself a polypeptide or antibody molecule 
composition of the present invention. Exemplary specific 
binding agents are second antibody molecules, complement 
proteins or fragments thereof, S. aureus protein A, and 
the like. Preferably the specific binding agent binds the 
reagent species when that species is present as part of a 
complex. 

In preferred embodiments, the specific binding 
agent is labeled. However, when the diagnostic system 
includes a specific binding agent that is not labeled, the 
agent is typically used as an amplifying means or reagent. 
In these embodiments, the labeled specific binding agent 
is capable of specifically binding the amplifying means 
when the amplifying means is bound to a reagent species- 
containing complex. 

The diagnostic kits of the present invention can 
be used in an "ELISA" format to detect the quantity of an 
antigen or antibody of this invention in a vascular fluid 
sample such as blood, serum, or plasma. "ELISA" refers to 
an enzyme-linked immunosorbent assay that employs an
antibody or antigen bound to a solid phase and an
enzyme-antigen or enzyme-antibody conjugate to detect and
quantify the amount of an antigen present in a sample. A
description of the ELISA technique is found in Chapter 22
of the 4th Edition of Basic and Clinical Immunology by
D.P. Sites et al., published by Lange Medical Publications
of Los Altos, CA in 1982 and in U.S. Patents No.
3,654,090; No. 3,850,752; and No. 4,016,043, which are all
incorporated herein by reference.

Thus, in some embodiments, a human monoclonal
antibody of the present invention can be affixed to a
solid matrix to form a solid support that comprises a
package in the subject diagnostic systems.

A reagent is typically affixed to a solid matrix
by adsorption from an aqueous medium, although other modes
of affixation applicable to proteins and
polypeptides, well known to those skilled in the art, can
be used.

Useful solid matrices are also well known in the
art. Such materials are water insoluble and include the
cross-linked dextran available under the trademark
SEPHADEX from Pharmacia Fine Chemicals (Piscataway, NJ);
agarose; polystyrene beads about 1 micron to
about 5 millimeters in diameter available from Abbott
Laboratories of North Chicago, IL; polyvinyl chloride,
polystyrene, cross-linked polyacrylamide, nitrocellulose-
or nylon-based webs such as sheets, strips or paddles; or
tubes, plates or the wells of a microtiter plate such as
those made from polystyrene or polyvinylchloride.

The reagent species, labeled specific binding
agent or amplifying reagent of any diagnostic system
described herein can be provided in solution, as a liquid
dispersion or as a substantially dry powder, e.g., in
lyophilized form. Where the indicating means is an
enzyme, the enzyme's substrate can also be provided in a
separate package of a system. A solid support such as the
before-described microtiter plate and one or more buffers
can also be included as separately packaged elements in
this diagnostic assay system.

The packaging materials discussed herein in
relation to diagnostic systems are those customarily
utilized in diagnostic systems.

The term "package" refers to a solid matrix or
material such as glass, plastic (e.g., polyethylene,
polypropylene and polycarbonate), paper, foil and the like
capable of holding within fixed limits a diagnostic
reagent such as a monoclonal antibody of the present
invention. Thus, for example, a package can be a bottle,
vial, plastic and plastic-foil laminated envelope or the
like container used to contain a contemplated diagnostic
reagent or it can be a microtiter plate well to which
microgram quantities of a contemplated diagnostic reagent
have been operatively affixed, i.e., linked so as to be
capable of being immunologically bound by an antibody or
polypeptide to be detected.

The materials for use in the assay of the
invention are ideally suited for the preparation of a kit.
Such a kit may comprise a carrier means being
compartmentalized to receive in close confinement one or
more container means such as vials, tubes, and the like,
each of the container means comprising one of the separate
elements to be used in the method. For example, one of
the container means may comprise a human monoclonal
antibody of the invention which is, or can be, detectably
labelled. The kit may also have containers containing any
of the other above-recited immunochemical reagents used to
practice the diagnostic methods.

F. Methods for Producing an HIV-Neutralizing
Human Monoclonal Antibody

The present invention describes methods for
producing novel HIV-neutralizing human monoclonal
antibodies. The methods are based generally on the use of
combinatorial libraries of antibody molecules which can be
produced from a variety of sources, and include naive
libraries, modified libraries, and libraries produced
directly from human donors exhibiting an HIV-specific
immune response.

The combinatorial library production and
manipulation methods have been extensively described in
the literature, and will not be reviewed in detail herein,
except for those features required to make and use unique
embodiments of the present invention. However, the
methods generally involve the use of a filamentous phage
(phagemid) surface expression vector system for cloning
and expressing antibody species of the library. Various
phagemid cloning systems to produce combinatorial
libraries have been described by others. See, for example,
the preparation of combinatorial antibody libraries on
phagemids as described by Kang et al., Proc. Natl. Acad.
Sci., USA, 88:4363-4366 (1991); Barbas et al., Proc. Natl.
Acad. Sci., USA, 88:7978-7982 (1991); Zebedee et al.,
Proc. Natl. Acad. Sci., USA, 89:3175-3179 (1992); Kang et
al., Proc. Natl. Acad. Sci., USA, 88:11120-11123 (1991);
Barbas et al., Proc. Natl. Acad. Sci., USA, 89:4457-4461
(1992); and Gram et al., Proc. Natl. Acad. Sci., USA,
89:3576-3580 (1992), which references are hereby
incorporated by reference.

In one embodiment, the method involves preparing a 
phagemid library of human monoclonal antibodies by using 
donor immune cell messenger RNA from HIV-infected donors. 
The donors can be symptomatic of AIDS, but in preferred 
embodiments the donor is asymptomatic, as the resulting 
library contains a substantially higher number of HIV- 
neutralizing human monoclonal antibodies. 

In another embodiment, the donor is naive relative 
to an immune response to HIV, i.e., the donor is not HIV- 
infected. Alternatively, the library can be synthetic, or 
can be derived from a donor who has an immune response to 
other antigens. 

The method for producing a human monoclonal 
antibody generally involves (1) preparing separate H and L 
chain-encoding gene libraries in cloning vectors using 
human immunoglobulin genes as a source for the libraries, 
(2) combining the H and L chain encoding gene libraries
into a single dicistronic expression vector capable of
expressing and assembling a heterodimeric antibody
molecule, (3) expressing the assembled heterodimeric
antibody molecule on the surface of a filamentous phage
particle, and (4) isolating the surface-expressed phage
particle using immunoaffinity techniques such as panning
of phage particles against a preselected antigen, thereby
isolating one or more species of phagemid containing
particular H and L chain-encoding genes and antibody
molecules that immunoreact with the preselected antigen.

As described herein in the Examples, the resulting
phagemid library can be manipulated to increase and/or
alter the immunospecificities of the monoclonal antibodies
of the library to produce and subsequently identify
additional, desirable, human monoclonal antibodies of the
present invention.

For example, the heavy (H) chain and light (L) 
chain immunoglobulin molecule encoding genes can be 
randomly mixed (shuffled) to create new HL pairs in an 
assembled immunoglobulin molecule. Additionally, either 
or both the H and L chain encoding genes can be 
mutagenized in the complementarity determining region 
(CDR) of the variable region of the immunoglobulin 
polypeptide, and subsequently screened for desirable 
immunoreaction and neutralization capabilities.
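
By way of illustration only, the random mixing (shuffling) of H and L chain encoding genes can be pictured computationally. The following is a minimal sketch, assuming hypothetical clone labels and a hypothetical function name (shuffle_chains); it models only the bookkeeping of forming new HL pairs, not any laboratory step described herein.

```python
import itertools
import random


def shuffle_chains(heavy_genes, light_genes, n_pairs, seed=0):
    """Draw distinct heavy/light pairings at random from all possible pairs."""
    rng = random.Random(seed)
    # Every heavy chain gene can in principle pair with every light chain gene.
    all_pairs = list(itertools.product(heavy_genes, light_genes))
    # Sample without replacement, capped at the number of possible pairs.
    return rng.sample(all_pairs, min(n_pairs, len(all_pairs)))


# Hypothetical clone labels, used purely for illustration.
heavy = ["H1", "H2", "H3"]
light = ["L1", "L2"]
pairs = shuffle_chains(heavy, light, n_pairs=4)
```

Each returned pair would then stand for one candidate antibody to be screened for immunoreaction and neutralization.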

In one embodiment, the H and L genes can be cloned 
into separate, monocistronic expression vectors, referred 
to as a "binary" system described further herein. In this 
method, step (2) above differs in that the combining of H 
and L chain encoding genes occurs by the co-introduction
of the two binary plasmids into a single host cell for 
expression and assembly of a phagemid having the surface 
accessible antibody heterodimer molecule. 

In one shuffling embodiment, the shuffling can be 
accomplished with the binary expression vectors, each 
capable of expressing a single heavy or light chain 
encoding gene, as described in Example 11.

In the present methods, the antibody molecules are
monoclonal because the cloning methods allow for the 
preparation of clonally pure species of antibody producing 
cell lines. In addition, the monoclonal antibodies are 
human because the H and L chain encoding genes are derived 
from human immunoglobulin producing immune cells, such as 
spleen, thymus, bone marrow, and the like. 

The method of producing an HIV-neutralizing human
monoclonal antibody also requires that the resulting
antibody library, immunoreactive with a preselected HIV
antigen, is screened for the presence of antibody species
which have the capacity to neutralize HIV in one or more
of the assays described herein for determining
neutralization capacity. Thus, a preferred library of
antibody molecules is first produced which binds to an HIV
antigen, preferably gp160, gp120, gp41, the V3 loop region
of gp160, or the CD4 binding site of gp120 and gp41, and
then is screened for the presence of HIV-neutralizing
antibodies as described herein.

Shuffled libraries can likewise be screened for
additional HIV-immunoreactive and neutralizing human
monoclonal antibodies.

As a further characterization of the present
invention, the nucleotide and corresponding amino acid
residue sequence of the antibody molecule's H or L chain 
encoding gene is determined by nucleic acid sequencing. 
The primary amino acid residue sequence information 
provides essential information regarding the antibody 
molecule's epitope reactivity. 

Sequence comparisons of identified HIV- 
immunoreactive monoclonal antibody variable chain region 
sequences are shown herein in Figures 10-13. The 
sequences are aligned based on sequence homology, and 
groups of related antibody molecules are identified 
thereby in which heavy chain or light chain genes share 
substantial sequence homology. 

An exemplary preparation of a human monoclonal 
antibody is described in the Examples. The isolation of a
particular vector capable of expressing an antibody of
interest involves the introduction of the dicistronic
expression vector into a host cell permissive for
expression of filamentous phage genes and the assembly of
phage particles. Where the binary vector system is used,
both vectors are introduced in the host cell. Typically,
the host is E. coli. Thereafter, a helper phage genome is
introduced into the host cell containing the
immunoglobulin expression vector(s) to provide the genetic
complementation necessary to allow phage particles to be
assembled. The resulting host cell is cultured to allow
the introduced phage genes and immunoglobulin genes to be
expressed, and for phage particles to be assembled and
shed from the host cell. The shed phage particles are
then harvested (collected) from the host cell culture
media and screened for desirable immunoreaction and
neutralization properties. Typically, the harvested
particles are "panned" for immunoreaction with a
preselected antigen. The strongly immunoreactive
particles are then collected, and individual species of
particles are clonally isolated and further screened for
HIV neutralization. Phage which produce neutralizing
antibodies are selected and used as a source of a human
HIV neutralizing monoclonal antibody of this invention.

Human monoclonal antibodies of this invention can
also be produced by altering the nucleotide sequence of a
polynucleotide sequence that encodes a heavy or light
chain of a monoclonal antibody of this invention. For
example, by site directed mutagenesis, one can alter the
nucleotide sequence of an expression vector and thereby
introduce changes in the resulting expressed amino acid
residue sequence. Thus one can take the polynucleotide of
SEQ ID NO 66, for example, and convert it into the
polynucleotide of SEQ ID NO 67. Similarly, one can take a
known polynucleotide and randomly alter it by random
mutagenesis, reintroduce the altered polynucleotide into
an expression system and subsequently screen the product
H:L pair for HIV-neutralizing activity.

Site-directed and random mutagenesis methods are
well known in the polynucleotide arts, and are not to be
construed as limiting as methods for altering the 
nucleotide sequence of a subject polynucleotide. 

Due to the presence of the phage particle in an
immunoaffinity isolated antibody, one embodiment involves
the manipulation of the resulting cloned genes to truncate
the immunoglobulin-coding gene such that a soluble Fab
fragment is secreted by the host E. coli cell containing
the phagemid vector. Thus, the resulting manipulated 
cloned immunoglobulin genes produce a soluble Fab which 
can be readily characterized in ELISA assays for epitope 
binding studies, in competition assays with known anti-HIV 
antibody molecules, and in HIV neutralization assays. The 
solubilized Fab provides a reproducible and comparable 
antibody preparation for comparative and characterization 
studies. 

The preparation of soluble Fab is generally 
described in the immunological arts, and can be conducted 
as described herein in Example 2b6), or as described by
Burton et al., Proc. Natl. Acad. Sci. USA, 88:10134-10137
(1991).

G. Expression Vectors and Polynucleotides for
Expressing Anti-HIV Monoclonal Antibodies

The preparation of human monoclonal
antibodies of this invention depends, in one embodiment,
on the cloning and expression vectors used to prepare the
combinatorial antibody libraries described herein. The
cloned immunoglobulin heavy and light chain genes can be
shuttled between lambda vectors, phagemid vectors and
plasmid vectors at various stages of the methods described
herein.

The phagemid vectors produce fusion proteins that 
are expressed on the surface of an assembled filamentous 
phage particle. 

A preferred phagemid vector of the present 
invention is a recombinant DNA (rDNA) molecule containing
a nucleotide sequence that codes for and is capable of
expressing a fusion polypeptide containing, in the
direction of amino- to carboxy-terminus, (1) a prokaryotic
secretion signal domain, (2) a heterologous polypeptide
defining an immunoglobulin heavy or light chain variable
region, and (3) a filamentous phage membrane anchor
domain. The vector includes DNA expression control
sequences for expressing the fusion polypeptide,
preferably prokaryotic control sequences.

The filamentous phage membrane anchor is
preferably a domain of the cpIII or cpVIII coat protein
capable of associating with the matrix of a filamentous
phage particle, thereby incorporating the fusion
polypeptide onto the phage surface.

The secretion signal is a leader peptide domain of
a protein that targets the protein to the periplasmic
membrane of gram negative bacteria. A preferred secretion
signal is a pelB secretion signal. The predicted amino
acid residue sequences of the secretion signal domain from
two pelB gene product variants from Erwinia carotovora are
described in Lei et al., Nature, 331:543-546 (1988).

The leader sequence of the pelB protein has
previously been used as a secretion signal for fusion
proteins (Better et al., Science, 240:1041-1043 (1988);
Sastry et al., Proc. Natl. Acad. Sci. USA, 86:5728-5732
(1989); and Mullinax et al., Proc. Natl. Acad. Sci. USA,
87:8095-8099 (1990)). Amino acid residue sequences for
other secretion signal polypeptide domains from E. coli
useful in this invention are described in Oliver,
Escherichia coli and Salmonella typhimurium, Neidhardt,
F.C. (ed.), American Society for Microbiology, Washington,
D.C., 1:56-69 (1987).

Preferred membrane anchors for the vector are
obtainable from filamentous phage M13, f1, fd, and
equivalent filamentous phage. Preferred membrane anchor
domains are found in the coat proteins encoded by gene III
and gene VIII. The membrane anchor domain of a
filamentous phage coat protein is a portion of the carboxy
terminal region of the coat protein and includes a region
of hydrophobic amino acid residues for spanning a lipid
bilayer membrane, and a region of charged amino acid
residues normally found at the cytoplasmic face of the
membrane and extending away from the membrane.

In the phage f1, gene VIII coat protein's membrane
spanning region comprises residue Trp-26 through Lys-40,
and the cytoplasmic region comprises the carboxy-terminal
11 residues from 41 to 52 (Ohkawa et al., J. Biol. Chem.,
256:9951-9958 (1981)). An exemplary membrane anchor would
consist of residues 26 to 40 of cpVIII. Thus, the amino
acid residue sequence of a preferred membrane anchor
domain is derived from the M13 filamentous phage gene VIII
coat protein (also designated cpVIII or CP 8). Gene VIII
coat protein is present on a mature filamentous phage over
the majority of the phage particle with typically about
2500 to 3000 copies of the coat protein.

In addition, the amino acid residue sequence of 
another preferred membrane anchor domain is derived from 
the M13 filamentous phage gene III coat protein (also 
designated cpIII). Gene III coat protein is present on a
mature filamentous phage at one end of the phage particle 
with typically about 4 to 6 copies of the coat protein. 

For detailed descriptions of the structure of 
filamentous phage particles, their coat proteins and 
particle assembly, see the reviews by Rasched et al.,
Microbiol. Rev., 50:401-427 (1986); and Model et al., in
"The Bacteriophages: Vol. 2", R. Calendar, ed., Plenum
Publishing Co., pp. 375-456 (1988).

DNA expression control sequences comprise a set of 
DNA expression signals for expressing a structural gene 
product and include both 5' and 3' elements, as is well 
known, operatively linked to the cistron such that the 
cistron is able to express a structural gene product. The
5' control sequences define a promoter for initiating
transcription and a ribosome binding site operatively
linked at the 5' terminus of the upstream translatable DNA
sequence.

To achieve high levels of gene expression in E.
coli, it is necessary to use not only strong promoters to
generate large quantities of mRNA, but also ribosome
binding sites to ensure that the mRNA is efficiently
translated. In E. coli, the ribosome binding site
includes an initiation codon (AUG) and a sequence 3-9
nucleotides long located 3-11 nucleotides upstream from
the initiation codon (Shine et al., Nature, 254:34 (1975)).
The sequence, AGGAGGU, which is called the Shine-Dalgarno
(SD) sequence, is complementary to the 3' end of E. coli
16S rRNA. Binding of the ribosome to mRNA and the
sequence at the 3' end of the mRNA can be affected by
several factors:

(i) The degree of complementarity between
the SD sequence and 3' end of the 16S rRNA.

(ii) The spacing and possibly the DNA
sequence lying between the SD sequence and the AUG.
Roberts et al., Proc. Natl. Acad. Sci. USA, 76:760
(1979a); Roberts et al., Proc. Natl. Acad. Sci. USA,
76:5596 (1979b); Guarente et al., Science, 209:1428
(1980); and Guarente et al., Cell, 20:543 (1980).
Optimization is achieved by measuring the level of
expression of genes in plasmids in which this spacing is
systematically altered. Comparison of different mRNAs
shows that there are statistically preferred sequences
from positions -20 to +13 (where the A of the AUG is
position 0). Gold et al., Annu. Rev. Microbiol., 35:365
(1981). Leader sequences have been shown to influence
translation dramatically. Roberts et al., 1979a, b, supra.

(iii) The nucleotide sequence following the
AUG, which affects ribosome binding. Taniguchi et al., J.
Mol. Biol., 118:533 (1978).
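
The relationship between the SD sequence and the initiation codon described above can be illustrated with a short computational sketch. It is offered only as an illustration under stated assumptions: the function names (sd_fragments, find_rbs) are hypothetical, the "gap" is taken to mean the number of nucleotides between the last matched SD base and the A of the AUG, and only exact matches to contiguous portions of AGGAGGU are considered.

```python
def sd_fragments(sd="AGGAGGU", min_len=3):
    """All contiguous pieces of the SD consensus of at least min_len bases, longest first."""
    frags = {sd[i:j] for i in range(len(sd)) for j in range(i + min_len, len(sd) + 1)}
    return sorted(frags, key=len, reverse=True)


def find_rbs(mrna, sd="AGGAGGU", min_len=3, min_gap=3, max_gap=11):
    """Return (aug_index, sd_fragment, gap) for each AUG preceded by an SD-like stretch.

    Assumption: gap counts the bases lying between the SD fragment and the AUG.
    """
    hits = []
    frags = sd_fragments(sd, min_len)
    start = mrna.find("AUG")
    while start != -1:
        best = None
        for frag in frags:  # longest fragments are tried first
            for gap in range(min_gap, max_gap + 1):
                begin = start - gap - len(frag)
                if begin >= 0 and mrna[begin:begin + len(frag)] == frag:
                    best = (start, frag, gap)
                    break
            if best:
                break
        if best:
            hits.append(best)
        start = mrna.find("AUG", start + 1)
    return hits


# Hypothetical test message: full SD consensus, six spacer bases, then AUG.
example = "GGCC" + "AGGAGGU" + "AAACCC" + "AUGGCU"
sites = find_rbs(example)  # [(17, 'AGGAGGU', 6)]
```

A scan of this kind addresses only factors (i) and (ii) in a crude way; it does not model the influence of the sequence following the AUG.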

The 3' control sequences define at least one
termination (stop) codon in frame with and operatively
linked to the heterologous fusion polypeptide.

In preferred embodiments, the vector utilized 
includes a prokaryotic origin of replication or replicon, 
i.e., a DNA sequence having the ability to direct
autonomous replication and maintenance of the recombinant
DNA molecule extrachromosomally in a prokaryotic host
cell, such as a bacterial host cell, transformed
therewith. Such origins of replication are well known in
the art. Preferred origins of replication are those that
are efficient in the host organism. A preferred host cell
is E. coli. For use of a vector in E. coli, a preferred
origin of replication is ColE1 found in pBR322 and a
variety of other common plasmids. Also preferred is the
p15A origin of replication found on pACYC and its
derivatives. The ColE1 and p15A replicons have been
extensively utilized in molecular biology, are available
on a variety of plasmids and are described at least by
Sambrook et al., in "Molecular Cloning: a Laboratory
Manual", 2nd edition, Cold Spring Harbor Laboratory Press
(1989).

The ColE1 and p15A replicons are particularly
preferred for use in one embodiment of the present
invention where two "binary" plasmids are utilized because
they each have the ability to direct the replication of a
plasmid in E. coli while the other replicon is present in
a second plasmid in the same E. coli cell. In other
words, ColE1 and p15A are non-interfering replicons that
allow the maintenance of two plasmids in the same host
(see, for example, Sambrook et al., supra, at pages 1.3-
1.4). This feature is particularly important in the
binary vectors embodiment of the present invention because 
a single host cell permissive for phage replication must 
support the independent and simultaneous replication of 
two separate vectors, namely a first vector for expressing 
a heavy chain polypeptide, and a second vector for 
expressing a light chain polypeptide. 

In addition, those embodiments that include a 
prokaryotic replicon can also include a gene whose 
expression confers a selective advantage, such as drug 
resistance, to a bacterial host transformed therewith. 
Typical bacterial drug resistance genes are those that 
confer resistance to ampicillin, tetracycline,
neomycin/kanamycin or chloramphenicol. Vectors typically
also contain convenient restriction sites for insertion of
translatable DNA sequences. Exemplary vectors are the
plasmids pUC8, pUC9, pBR322, and pBR329 available from
BioRad Laboratories (Richmond, CA) and pPL and pKK223
available from Pharmacia (Piscataway, NJ).

A vector for expression of a monoclonal antibody
of the invention on the surface of a filamentous phage
particle is a recombinant DNA (rDNA) molecule adapted for
receiving and expressing translatable first and second DNA
sequences in the form of first and second polypeptides
wherein one of the polypeptides is fused to a filamentous
phage coat protein membrane anchor. That is, at least one
of the polypeptides is a fusion polypeptide containing a
filamentous phage membrane anchor domain, a prokaryotic
secretion signal domain, and an immunoglobulin heavy or
light chain variable domain.

A DNA expression vector for expressing a
heterodimeric antibody molecule provides a system for
independently cloning (inserting) the two translatable DNA
sequences into two separate cassettes present in the
vector, to form two separate cistrons for expressing the
first and second polypeptides of the antibody molecule, or
the ligand binding portions of the polypeptides that
comprise the antibody molecule (i.e., the H and L variable
regions of an immunoglobulin molecule). The DNA
expression vector for expressing two cistrons is referred
to as a dicistronic expression vector.

The vector comprises a first cassette that
includes upstream and downstream translatable DNA
sequences operatively linked via a sequence of nucleotides
adapted for directional ligation to an insert DNA. The
upstream translatable sequence encodes the secretion
signal as defined herein. The downstream translatable
sequence encodes the filamentous phage membrane anchor as
defined herein. The cassette preferably includes DNA
expression control sequences for expressing the receptor
polypeptide that is produced when an insert translatable
DNA sequence (insert DNA) is directionally inserted into
the cassette via the sequence of nucleotides adapted for
directional ligation. The filamentous phage membrane
anchor is preferably a domain of the cpIII or cpVIII coat
protein capable of binding the matrix of a filamentous
phage particle, thereby incorporating the fusion
polypeptide onto the phage surface.

The receptor expressing vector also contains a 
second cassette for expressing a second receptor 
polypeptide. The second cassette includes a second 
translatable DNA sequence that encodes a secretion signal, 
as defined herein, operatively linked at its 3' terminus 
via a sequence of nucleotides adapted for directional 
ligation to a downstream DNA sequence of the vector that 
typically defines at least one stop codon in the reading 
frame of the cassette. The second translatable DNA 
sequence is operatively linked at its 5' terminus to DNA 
expression control sequences forming the 5' elements. The 
second cassette is capable, upon insertion of a 
translatable DNA sequence (insert DNA), of expressing the
second fusion polypeptide comprising a receptor of the 
secretion signal with a polypeptide coded by the insert 
DNA. 

An upstream translatable DNA sequence encodes a 
prokaryotic secretion signal as described earlier. The 
upstream translatable DNA sequence encoding the pelB 
secretion signal is a preferred DNA sequence for inclusion 
in a receptor expression vector. A downstream 
translatable DNA sequence encodes a filamentous phage 
membrane anchor as described earlier. Thus, a downstream 
translatable DNA sequence encodes an amino acid residue 
sequence that corresponds, and preferably is identical, to 
the membrane anchor domain of either a filamentous phage 
gene III or gene VIII coat polypeptide. 

A cassette in a DNA expression vector of this 
invention is the region of the vector that forms, upon 
insertion of a translatable DNA sequence (insert DNA), a
sequence of nucleotides capable of expressing, in an
appropriate host, a fusion polypeptide. The expression-
competent sequence of nucleotides is referred to as a
cistron. Thus, the cassette comprises DNA expression
control elements operatively linked to the upstream and
downstream translatable DNA sequences. A cistron is
formed when a translatable DNA sequence is directionally
inserted (directionally ligated) between the upstream and
downstream sequences via the sequence of nucleotides
adapted for that purpose. The resulting three
translatable DNA sequences, namely the upstream, the
inserted and the downstream sequences, are all operatively
linked in the same reading frame.
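
The requirement that the upstream, inserted and downstream sequences share one reading frame can be checked in a simplified way, as sketched below. The sketch is illustrative only and rests on assumptions not made in the text: each segment is taken to begin and end on a codon boundary, the upstream segment is taken to begin with ATG, and the placeholder sequences and the function name (is_single_reading_frame) are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_single_reading_frame(upstream, insert, downstream):
    """True if the joined segments read as one frame with no premature stop codon."""
    segments = (upstream, insert, downstream)
    # Simplifying assumption: every segment is a whole number of codons.
    if any(len(s) == 0 or len(s) % 3 != 0 for s in segments):
        return False
    cistron = "".join(segments).upper()
    codons = [cistron[i:i + 3] for i in range(0, len(cistron), 3)]
    if codons[0] != "ATG":
        return False
    # A stop codon is tolerated only as the final codon of the cistron.
    return not any(codon in STOP_CODONS for codon in codons[:-1])


# Hypothetical placeholder sequences, not taken from any vector described herein.
ok = is_single_reading_frame("ATGAAA", "GCTGCT", "GGTTAA")
frameshifted = is_single_reading_frame("ATGAAA", "GCTGC", "GGTTAA")
premature_stop = is_single_reading_frame("ATGAAA", "GCTTAG", "GGTTAA")
```

In an actual cassette a junction may fall within a codon, so a practical check would translate the full joined sequence rather than test each segment length separately.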

Thus, a DNA expression vector for expressing an
antibody molecule provides a system for cloning
translatable DNA sequences into the cassette portions of
the vector to produce cistrons capable of expressing the
first and second polypeptides, i.e., the heavy and light
chains of a monoclonal antibody.

As used herein, the term "vector" refers to a
nucleic acid molecule capable of transporting between
different genetic environments another nucleic acid to
which it has been operatively linked. Preferred vectors
are those capable of autonomous replication and expression
of structural gene products present in the DNA segments to
which they are operatively linked. Vectors, therefore,
preferably contain the replicons and selectable markers
described earlier.

As used herein with regard to DNA sequences or
segments, the phrase "operatively linked" means the
sequences or segments have been covalently joined,
preferably by conventional phosphodiester bonds, into one
strand of DNA, whether in single or double stranded form.
The choice of vector to which a transcription unit or a
cassette of this invention is operatively linked depends
directly, as is well known in the art, on the functional
properties desired, e.g., vector replication and protein
expression, and the host cell to be transformed, these
being limitations inherent in the art of constructing
recombinant DNA molecules.

A sequence of nucleotides adapted for directional 
ligation, i.e., a polylinker, is a region of the DNA 
expression vector that (1) operatively links for 
replication and transport the upstream and downstream 
translatable DNA sequences and (2) provides a site or 
means for directional ligation of a DNA sequence into the 
vector. Typically, a directional polylinker is a sequence 
of nucleotides that defines two or more restriction 
endonuclease recognition sequences, or restriction sites. 
Upon restriction cleavage, the two sites yield cohesive 
termini to which a translatable DNA sequence can be 
ligated to the DNA expression vector. Preferably, the two 
restriction sites provide, upon restriction cleavage, 
cohesive termini that are non-complementary and thereby 
permit directional insertion of a translatable DNA 
sequence into the cassette. In one embodiment, the 
directional ligation means is provided by nucleotides 
present in the upstream translatable DNA sequence, 
downstream translatable DNA sequence, or both. In another 
embodiment, the sequence of nucleotides adapted for 
directional ligation comprises a sequence of nucleotides 
that defines multiple directional cloning means. Where
the sequence of nucleotides adapted for directional 
ligation defines numerous restriction sites, it is 
referred to as a multiple cloning site. 
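
The reason non-complementary cohesive termini permit directional insertion can be shown with a small sketch. It is an illustration under stated assumptions: overhangs are written 5' to 3' as single-stranded 5' extensions, the function names are hypothetical, and the enzymes named in the comments (XhoI, SpeI, XbaI) are chosen here only as familiar examples and are not asserted to be the sites of any particular vector described herein.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhangs_compatible(a, b):
    """Two 5' overhangs can anneal when one is the reverse complement of the other."""
    return a.upper() == reverse_complement(b)


def permits_directional_insertion(site1_overhang, site2_overhang):
    """Directional cloning needs the two vector termini to be unable to pair with each other."""
    return not overhangs_compatible(site1_overhang, site2_overhang)


# XhoI (C^TCGAG) leaves TCGA; SpeI (A^CTAGT) and XbaI (T^CTAGA) both leave CTAG.
xho_spe = permits_directional_insertion("TCGA", "CTAG")
spe_xba = permits_directional_insertion("CTAG", "CTAG")
```

Here the XhoI/SpeI pair yields termini that cannot anneal to one another, so an insert carrying the matching ends can ligate in one orientation only, whereas the SpeI/XbaI pair yields identical overhangs and gives no orientation control.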

In a preferred embodiment, a DNA expression vector 
is designed for convenient manipulation in the form of a 
filamentous phage particle encapsulating a genome 
according to the teachings of the present invention. In 
this embodiment, a DNA expression vector further contains 
a nucleotide sequence that defines a filamentous phage 
origin of replication such that the vector, upon 
presentation of the appropriate genetic complementation, 
can replicate as a filamentous phage in single stranded 
replicative form and be packaged into filamentous phage 
particles. This feature provides the ability of the DNA 
expression vector to be packaged into phage particles for
subsequent segregation of the particle, and vector
contained therein, away from other particles that comprise 
a population of phage particles. 

A filamentous phage origin of replication is a
region of the phage genome, as is well known, that defines
sites for initiation of replication, termination of
replication and packaging of the replicative form produced
by replication (see, for example, Rasched et al.,
Microbiol. Rev., 50:401-427 (1986); and Horiuchi, J. Mol.
Biol., 188:215-223 (1986)).

A preferred filamentous phage origin of
replication for use in the present invention is an M13, f1
or fd phage origin of replication (Short et al., Nucl.
Acids Res., 16:7583-7600 (1988)). Preferred DNA
expression vectors for cloning and expressing a human
monoclonal antibody of this invention are the dicistronic
expression vectors pComb8, pComb2-8, pComb3, pComb2-3 and
pComb2-3', described herein.

A particularly preferred vector of the present
invention includes a polynucleotide sequence that encodes
a heavy or light chain variable region of a human
monoclonal antibody of the present invention.
Particularly preferred are vectors that include a
nucleotide sequence that encodes a heavy or light chain
amino acid residue sequence shown in Figures 10-13, that
encodes a heavy or light chain having the binding
specificity of those sequences shown in Figures 10-13, or
that encodes a heavy or light chain having conservative
substitutions relative to a sequence shown in Figures 10-
13, and complementary polynucleotide sequences thereto.

Insofar as polynucleotides are component parts of
a DNA expression vector for producing a human monoclonal
antibody heavy or light chain immunoglobulin variable
region amino acid residue sequence, the invention also
contemplates isolated polynucleotides that encode such
heavy or light chain sequences.

It is to be understood that, due to the genetic 
code and its attendant redundancies, numerous
polynucleotide sequences can be designed that encode a
contemplated heavy or light chain immunoglobulin variable 
region amino acid residue sequence. Thus, the invention 
contemplates such alternate polynucleotide sequences 
incorporating the features of the redundancy of the 
genetic code. 
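
The extent of this redundancy can be estimated by counting, for each residue, the number of codons in the standard genetic code that specify it. The sketch below is illustrative only; the function name (count_encodings) and the example peptides are hypothetical and are not sequences of the invention.

```python
from math import prod

# Number of codons specifying each amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide):
    """Number of distinct coding sequences (stop codon excluded) for a peptide."""
    return prod(CODON_COUNTS[residue] for residue in peptide.upper())


# Hypothetical peptides: Met-Trp has one encoding; Leu-Ser-Arg has 6 * 6 * 6.
single = count_encodings("MW")    # 1
many = count_encodings("LSR")     # 216
```

Because the count multiplies across residues, even a short variable region segment admits a very large number of alternate polynucleotide sequences.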

Insofar as the expression vector for producing a 
human monoclonal antibody of this invention is carried in 
a host cell compatible with expression of the antibody, 
the invention contemplates a host cell containing a vector 
or polynucleotide of this invention. A preferred host 
cell is E. coli, as described herein.

E. coli cultures containing preferred expression
vectors that produce a human monoclonal antibody of this
invention were deposited pursuant to Budapest Treaty
requirements with the American Type Culture Collection
(ATCC), Rockville, MD, as described herein.

Examples 

The following examples are intended to illustrate, 
but not limit, the scope of the invention. 

1. Construction of a Dicistronic Expression Vector
for Producing a Heterodimeric Receptor on Phage
Particles

To obtain a vector system for generating a large 
number of Fab antibody fragments that can be screened 
directly, expression libraries in bacteriophage Lambda 
have previously been constructed as described in Huse et 
al., Science, 246:1275-1281 (1989). These systems did not
contain design features that provide for the expressed Fab 
to be targeted to the surface of a filamentous phage 
particle. 

The main criterion used in choosing a vector 
system was the necessity of generating the largest number 
of Fab fragments which could be screened directly. 
Bacteriophage Lambda was selected as the starting point to 
develop an expression vector for three reasons. First, in
vitro packaging of phage DNA was the most efficient method
of reintroducing DNA into host cells. Second, it was
possible to detect protein expression at the level of
single phage plaques. Finally, the screening of phage
libraries typically involved less difficulty with
nonspecific binding. The alternative, plasmid cloning
vectors, are only advantageous in the analysis of clones
after they have been identified. This advantage was not
lost in the present system because of the use of a
dicistronic expression vector such as pCombVIII, thereby
permitting a plasmid containing the heavy chain, light
chain, or Fab expressing inserts to be excised.

a. Construction of Dicistronic Expression Vector
pCOMB

1) Preparation of Lambda Zap™ II

Lambda Zap™ II is a derivative of the
original Lambda Zap (ATCC Accession No. 40,298) that
maintains all of the characteristics of the original
Lambda Zap including 6 unique cloning sites, fusion
protein expression, and the ability to rapidly excise the
insert in the form of a phagemid (Bluescript SK-), but
lacks the SAM 100 mutation, allowing growth on many
Non-Sup F strains, including XL1-Blue. The Lambda Zap™ II
was constructed as described in Short et al., Nuc. Acids
Res., 16:7583-7600, 1988, by replacing the Lambda S gene
contained in a 4254 base pair (bp) DNA fragment produced
by digesting Lambda Zap with the restriction enzyme Nco I.
This 4254 bp DNA fragment was replaced with the 4254 bp
DNA fragment containing the Lambda S gene isolated from
Lambda gt10 (ATCC # 40,179) after digesting the vector
with the restriction enzyme Nco I. The 4254 bp DNA
fragment isolated from lambda gt10 was ligated into the
original Lambda Zap vector using T4 DNA ligase and
standard protocols such as those described in Current
Protocols in Molecular Biology, Ausubel et al., eds., John
Wiley and Sons, NY, 1987, to form Lambda Zap™ II.

2) Preparation of Lambda Hc2

To express a plurality of VH-coding DNA
homologs in an E. coli host cell, a vector designated
Lambda Hc2 was constructed. The vector provided the
following: the capacity to place the VH-coding DNA
homologs in the proper reading frame; a ribosome binding
site as described by Shine et al., Nature, 254:34 (1975);
a leader sequence directing the expressed protein to the
periplasmic space designated the pelB secretion signal; a
polynucleotide sequence that coded for a known epitope
(epitope tag); and also a polynucleotide that coded for a
spacer protein between the VH-coding DNA homolog and the
polynucleotide coding for the epitope tag. Lambda Hc2 has
been previously described by Huse et al., Science,
246:1275-1281 (1989).

To prepare Lambda Hc2, a synthetic DNA sequence 
containing all of the above features was constructed by 
designing single stranded polynucleotide segments of 20-40 
bases that would hybridize to each other and form the 
double stranded synthetic DNA sequence shown in Figure 1. 
The individual single-stranded polynucleotide segments are 
shown in Table 1. 

Polynucleotides N2, N3, N9-4, N11, N10-5, N6, N7
and N8 (Table 1) were kinased by adding 1 µl of each
polynucleotide (0.1 micrograms/microliter (µg/µl)) and 20
units of T4 polynucleotide kinase to a solution containing
70 mM Tris-HCl (Tris[hydroxymethyl]aminomethane
hydrochloride) at pH 7.6, 10 mM MgCl2, 5 mM dithiothreitol
(DTT), 10 mM beta-mercaptoethanol, 500 micrograms per
milliliter (µg/ml) bovine serum albumin (BSA). The
solution was maintained at 37 degrees Centigrade (37°C)
for 30 minutes and the reaction stopped by maintaining the
solution at 65°C for 10 minutes. The two end
polynucleotides, 20 nanograms (ng) of polynucleotides N1
and polynucleotides N12, were added to the above kinasing
reaction solution together with 1/10 volume of a solution
containing 20 mM Tris-HCl at pH 7.4, 2.0 mM MgCl2 and 50 mM
NaCl. This solution was heated to 70°C for 5 minutes and
allowed to cool to room temperature, approximately 25°C,
over 1.5 hours in a 500 ml beaker of water. During this
time period all 10 polynucleotides annealed to form the
double stranded synthetic DNA insert shown in Figure 1.
The individual polynucleotides were covalently linked to
each other to stabilize the synthetic DNA insert by adding
40 µl of the above reaction to a solution containing 50 mM
Tris-HCl at pH 7.5, 7 mM MgCl2, 1 mM DTT, 1 mM adenosine
triphosphate (ATP) and 10 units of T4 DNA ligase. This
solution was maintained at 37°C for 30 minutes and then
the T4 DNA ligase was inactivated by maintaining the
solution at 65°C for 10 minutes. The end polynucleotides
were kinased by mixing 52 µl of the above reaction, 4 µl
of a solution containing 10 mM ATP and 5 units of T4
polynucleotide kinase. This solution was maintained at
37°C for 30 minutes and then the T4 polynucleotide kinase
was inactivated by maintaining the solution at 65°C for 10
minutes.

Table 1

SEQ ID NO
(15)   N1)     5' GGCCGCAAATTCTATTTCAAGGAGACAGTCAT 3'
(16)   N2)     5' AATGAAATACCTATTGCCTACGGCAGCCGCTGGATT 3'
(17)   N3)     5' GTTATTACTCGCTGCCCAACCAGCCATGGCCC 3'
(18)   N6)     5' CAGTTTCACCTGGGCCATGGCTGGTTGGG 3'
(19)   N7)     5' CAGCGAGTAATAACAATCCAGCGGCTGCCGTAGGCAATAG 3'
(20)   N8)     5' GTATTTCATTATGACTGTCTCCTTGAAATAGAATTTGC 3'
(21)   N9-4)   5' AGGTGAAACTGCTCGAGATTTCTAGACTAGTTACCCGTAC 3'
(22)   N10-5)  5' CGGAACGTCGTACGGGTAACTAGTCTAGAAATCTCGAG 3'
(23)   N11)    5' GACGTTCCGGACTACGGTTCTTAATAGAATTCG 3'
(24)   N12)    5' TCGACGAATTCTATTAAGAACCGTAGTC 3'

The completed synthetic DNA insert was ligated
directly into the Lambda Zap™ II vector described
in Example 1a1) that had been previously digested
with the restriction enzymes Not I and Xho I. The
ligation mixture was packaged according to the
manufacturer's instructions using Gigapack II Gold
packaging extract available from Stratagene, La
Jolla, California. The packaged ligation mixture
was plated on XL1-Blue cells (Stratagene).
Individual lambda plaques were cored and the
inserts excised according to the in vivo excision
protocol for Lambda Zap™ II provided by the
manufacturer (Stratagene). This in vivo excision
protocol moved the cloned insert from the Lambda
Hc2 vector into a phagemid vector to allow for easy
manipulation and sequencing. The accuracy of the
above cloning steps was confirmed by sequencing the
insert using the Sanger dideoxy method described
by Sanger et al., Proc. Natl. Acad. Sci., USA,
74:5463-5467 (1977) and using the manufacturer's
instructions in the AMV Reverse Transcriptase
35S-ATP sequencing kit (Stratagene). The sequence
of the resulting double-stranded synthetic DNA
insert in the VH expression vector (Lambda Hc2) is
shown in Figure 1. The sequence of each strand
(top and bottom) of Lambda Hc2 is listed in the
Sequence Listing as SEQ ID NO 1 and SEQ ID NO 2,
respectively. The resultant Lambda Hc2 expression
vector is shown in Figure 2.
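
The way the Table 1 polynucleotides tile into a duplex can be checked computationally. The following is a minimal sketch, assuming the sequences as printed in Table 1 (with extraction spacing removed); the helper names `revcomp` and `overlap` are illustrative and are not part of the described procedure. It checks that bottom-strand segment N8 bridges top-strand segments N1 and N2, and reports the unpaired 5' bases left at each end of the insert by the two unkinased end segments N1 and N12. Not I and Xho I are known to leave 5'-GGCC and 5'-TCGA overhangs, respectively, so the reported overhangs are of the kind expected for ligation into the digested vector.

```python
# Sketch only: sequences copied from Table 1; helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

N1 = "GGCCGCAAATTCTATTTCAAGGAGACAGTCAT"        # top strand, 5' end segment
N2 = "AATGAAATACCTATTGCCTACGGCAGCCGCTGGATT"    # top strand, second segment
N8 = "GTATTTCATTATGACTGTCTCCTTGAAATAGAATTTGC"  # bottom strand, pairs with N1 and N2
N11 = "GACGTTCCGGACTACGGTTCTTAATAGAATTCG"      # top strand, 3' end segment
N12 = "TCGACGAATTCTATTAAGAACCGTAGTC"           # bottom strand, 5' end segment

# Left end: the part of N1 not covered by N8 is a 5' overhang,
# and the rest of N8 should pair with the start of N2.
n8_top = revcomp(N8)
k_left = overlap(N1, n8_top)
left_overhang = N1[: len(N1) - k_left]
bridges = N2.startswith(n8_top[k_left:])

# Right end: the part of N12 not covered by N11 is a 5' overhang
# on the bottom strand (converted back to bottom-strand 5'->3').
n12_top = revcomp(N12)
k_right = overlap(N11, n12_top)
right_overhang = revcomp(n12_top[k_right:])

print("N1 5' overhang:  ", left_overhang)
print("N8 bridges N1/N2:", bridges)
print("N12 5' overhang: ", right_overhang)
```

The same check can be repeated for each remaining bottom-strand segment against the pair of top-strand segments it spans.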

3) Preparation of Lambda Lc2

To express a plurality of VL-coding
DNA homologs in an E. coli host cell, a vector
designated Lambda Lc2 was constructed. The vector had
the capacity to place the VL-coding DNA homologs in the
proper reading frame, provided a ribosome binding
site as described by Shine et al., Nature, 254:34
(1975), provided the pelB gene leader sequence
secretion signal that has been previously used to
successfully secrete Fab fragments in E. coli by
Lei et al., J. Bac., 169:4379 (1987) and Better et
al., Science, 240:1041 (1988), and also provided a
polynucleotide containing a restriction
endonuclease site for cloning. Lambda Lc2 has been
previously described by Huse et al., Science,
246:1275-1281 (1989).

A synthetic DNA sequence containing all of the
above features was constructed by designing single
stranded polynucleotide segments of 20-60 bases
that would hybridize to each other and form the
double stranded synthetic DNA sequence shown in
Figure 3. The sequence of each individual
single-stranded polynucleotide segment (O1-O8)
within the double stranded synthetic DNA sequence
is shown in Table 2.

Polynucleotides O2, O3, O4, O5, O6 and O7
(Table 2) were kinased by adding 1 µl (0.1 µg/µl)
of each polynucleotide and 20 units of T4
polynucleotide kinase to a solution containing 70
mM Tris-HCl at pH 7.6, 10 mM MgCl2, 5 mM DTT, 10 mM
beta-mercaptoethanol, 500 µg/ml of BSA. The
solution was maintained at 37°C for 30 minutes and
the reaction stopped by maintaining the solution at
65°C for 10 minutes. Then 20 ng each of the two end
polynucleotides, O1 and O8, were added to the above
kinasing reaction solution together with 1/10
volume of a solution containing 20.0 mM Tris-HCl at
pH 7.4, 2.0 mM MgCl2 and 15.0 mM sodium chloride
(NaCl). This solution was heated to 70°C for 5
minutes and allowed to cool to room temperature,
approximately 25°C, over 1.5 hours in a 500 ml
beaker of water. During this time period all 8
polynucleotides annealed to form the double
stranded synthetic DNA insert shown in Figure 3.
The individual polynucleotides were covalently
linked to each other to stabilize the synthetic DNA
insert by adding 40 µl of the above reaction to a
solution containing 50 mM Tris-HCl at pH 7.5, 7 mM
MgCl2, 1 mM DTT, 1 mM ATP and 10 units of T4 DNA
ligase. This solution was maintained at 37°C for
30 minutes and then the T4 DNA ligase was
inactivated by maintaining the solution at 65°C for
10 minutes. The end polynucleotides were kinased
by mixing 52 µl of the above reaction, 4 µl of a
solution containing 10 mM ATP and 5 units of T4
polynucleotide kinase. This solution was
maintained at 37°C for 30 minutes and then the T4
polynucleotide kinase was inactivated by
maintaining the solution at 65°C for 10 minutes.



Table 2

SEQ ID NO
(25)   O1)   5' TGAATTCTAAACTAGTCGCCAAGGAGACAGTCAT 3'
(26)   O2)   5' AATGAAATACCTATTGCCTACGGCAGCCGCTGGATT 3'
(27)   O3)   5' GTTATTACTCGCTGCCCAACCAGCCATGGCC 3'
(28)   O4)   5' GAGCTCGTCAGTTCTAGAGTTAAGCGGCCG 3'
(29)   O5)   5' GTATTTCATTATGACTGTCTCCTTGGCGACTAGTTTAGAATTCAAGCT 3'
(30)   O6)   5' CAGCGAGTAATAACAATCCAGCGGCTGCCGTAGGCAATAG 3'
(31)   O7)   5' TGACGAGCTCGGCCATGGCTGGTTGGG 3'
(32)   O8)   5' TCGACGGCCGCTTAACTCTAGAAC 3'



The completed synthetic DNA insert was ligated
directly into the Lambda Zap™ II vector described
in Example 1a1) that had been previously digested
with the restriction enzymes Sac I and Xho I. The
ligation mixture was packaged according to the
manufacturer's instructions using Gigapack II Gold
packaging extract (Stratagene). The packaged
ligation mixture was plated on XL1-Blue cells
(Stratagene). Individual lambda plaques were cored
and the inserts excised according to the in vivo
excision protocol for Lambda Zap™ II provided by
the manufacturer (Stratagene). This in vivo
excision protocol moved the cloned insert from the
Lambda Lc2 vector into a phagemid vector to
allow for easy manipulation and sequencing. The
accuracy of the above cloning steps was confirmed
by sequencing the insert using the manufacturer's
instructions in the AMV Reverse Transcriptase
35S-dATP sequencing kit (Stratagene). The sequence
of the resulting Lc2 expression vector (Lambda Lc2)
is shown in Figure 3. Each strand is separately
listed in the Sequence Listing as SEQ ID NO 3 and
SEQ ID NO 4. The resultant Lc2 vector is
schematically diagrammed in Figure 4.

A preferred vector for use in this invention,
designated Lambda Lc3, is a derivative of Lambda
Lc2 prepared above. Lambda Lc2 contains a Spe I
restriction site located 3' to the EcoR I
restriction site and 5' to the Shine-Dalgarno
ribosome binding site as shown in the sequence in
Figure 3 and in SEQ ID NO 3. A Spe I restriction
site is also present in Lambda Hc2 as shown in
Figures 1 and 2 and in SEQ ID NO 1. A
combinatorial vector, designated pComb, was
constructed by combining portions of Lambda Hc2 and
Lc2 together as described in Example 1a4) below.
The resultant combinatorial pComb vector contained
two Spe I restriction sites, one provided by Lambda
Hc2 and one provided by Lambda Lc2, with an EcoR I
site in between. Despite the presence of two Spe I
restriction sites, DNA homologs having Spe I and
EcoR I cohesive termini were successfully
directionally ligated into a pComb expression
vector previously digested with Spe I and EcoR I as
described in Example 1b below. The proximity of
the EcoR I restriction site to the 3' Spe I site,
provided by the Lc2 vector, inhibited the complete
digestion of the 3' Spe I site. Thus, digesting
pComb with Spe I and EcoR I did not result in
removal of the EcoR I site between the two Spe I
sites.
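
The relative order of these sites can be read directly off the top strand of the Lc2 synthetic insert. The sketch below is illustrative only: it assumes the top strand is the concatenation of segments O1 through O4 as printed in Table 2 (extraction spacing removed), uses the well-known recognition sequences of the named enzymes, and takes the motif `AGGAG` as a stand-in for the Shine-Dalgarno ribosome binding site, which is an assumption rather than something stated in the text.

```python
# Sketch only: top strand assumed to be O1+O2+O3+O4 from Table 2.
O1 = "TGAATTCTAAACTAGTCGCCAAGGAGACAGTCAT"
O2 = "AATGAAATACCTATTGCCTACGGCAGCCGCTGGATT"
O3 = "GTTATTACTCGCTGCCCAACCAGCCATGGCC"
O4 = "GAGCTCGTCAGTTCTAGAGTTAAGCGGCCG"
top_strand = O1 + O2 + O3 + O4

MOTIFS = {
    "EcoR I": "GAATTC",
    "Spe I": "ACTAGT",
    "Shine-Dalgarno": "AGGAG",  # assumed core motif for the ribosome binding site
    "Sac I": "GAGCTC",
    "Xba I": "TCTAGA",
}

# 0-based index of the first occurrence of each motif (-1 if absent).
positions = {name: top_strand.find(motif) for name, motif in MOTIFS.items()}
order = sorted(positions, key=positions.get)

for name in order:
    print(f"{name:15s} first base at position {positions[name] + 1}")
```

Under these assumptions the Spe I site falls between the EcoR I site and the ribosome binding site, in agreement with the description above, and the Sac I and Xba I cloning sites follow the pelB leader.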

The presence of a second Spe I restriction
site may be undesirable for ligations into a pComb
vector digested only with Spe I as the region
between the two sites would be eliminated.
Therefore, a derivative of Lambda Lc2 lacking the
second or 3' Spe I site, designated Lambda Lc3, was
produced by first digesting Lambda Lc2 with Spe I
to form a linearized vector. The ends were filled
in to form blunt ends which were ligated together to
result in Lambda Lc3 lacking a Spe I site. Lambda
Lc3 is a preferred vector for use in constructing a
combinatorial vector as described below.
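
Why filling in and religating removes the site can be illustrated with a small model. The sketch below is an assumption-laden illustration, not the procedure itself: it relies on the known Spe I cleavage pattern (A^CTAGT, leaving a 5'-CTAG overhang), models only the top strand, and uses the first segment of Table 2 (O1) as a stand-in for the surrounding vector sequence. The function name is hypothetical. Filling in both overhangs and joining the blunt ends duplicates the four overhang bases, so the recognition sequence is no longer present.

```python
# Sketch only: top-strand model of cut / fill-in / blunt-end religation.
SPE_I = "ACTAGT"  # Spe I cuts after the first base: A^CTAGT

def fill_in_and_religate(top: str, site: str = SPE_I, cut_offset: int = 1) -> str:
    """Cut one palindromic site, fill in both 5' overhangs, and rejoin.

    After fill-in the left fragment ends with the site minus its last
    `cut_offset` bases, and the right fragment starts with the site minus
    its first `cut_offset` bases, so the overhang bases appear twice.
    """
    i = top.index(site)
    overhang_len = len(site) - 2 * cut_offset
    left = top[: i + cut_offset + overhang_len]   # ...ACTAG
    right = top[i + cut_offset:]                  # CTAGT...
    return left + right

# O1 from Table 2, carrying the EcoR I and Spe I sites of Lambda Lc2.
lc2_segment = "TGAATTCTAAACTAGTCGCCAAGGAGACAGTCAT"
lc3_like = fill_in_and_religate(lc2_segment)

print(lc3_like)
print("Spe I site still present:", SPE_I in lc3_like)
```

In this model the junction gains four base pairs upstream of the ribosome binding site, and the EcoR I site is unaffected.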

4) Preparation of pComb 

Phagemids were excised from the
expression vectors Lambda Hc2 or Lambda Lc2 using
an in vivo excision protocol described above.
Double stranded DNA was prepared from the
phagemid-containing cells according to the methods
described by Holmes et al., Anal. Biochem., 114:193
(1981). The phagemids resulting from in vivo
excision contained the same nucleotide sequences
for antibody fragment cloning and expression as did
the parent vectors, and are designated phagemid Hc2
and Lc2, corresponding to Lambda Hc2 and Lc2,
respectively.

For the construction of combinatorial phagemid
vector pComb, produced by combining portions of
phagemid Hc2 and phagemid Lc2, phagemid Hc2 was
first digested with Sac I to remove the restriction
site located 5' to the LacZ promoter. The
linearized phagemid was then blunt ended with T4
polymerase and ligated to result in a Hc2 phagemid
lacking a Sac I site. The modified Hc2 phagemid
and the Lc2 phagemid were then separately
restriction digested with Sca I and EcoR I to
result in a Hc2 fragment having from 5' to 3' Sca
I, Not I, Xho I, Spe I and EcoR I restriction sites
and a Lc2 fragment having from 5' to 3' EcoR I, Sac
I, Xba I and Sac I restriction sites. The
linearized phagemids were then ligated together at
their respective cohesive ends to form pComb, a
circularized phagemid having a linear arrangement
of restriction sites of Not I, Xho I, Spe I, EcoR
I, Sac I, Xba I, Not I, Apa I and Sca I. The
ligated phagemid vector was then inserted into an
appropriate bacterial host and transformants were
selected on the antibiotic ampicillin.

Selected ampicillin resistant transformants
were screened for the presence of two Not I sites.
The resulting ampicillin resistant combinatorial
phagemid vector was designated pComb, the schematic
organization of which is shown in Figure 5. The
resultant combinatorial vector, pComb, consisted of
a DNA molecule having two cassettes to express two
fusion proteins and having nucleotide residue
sequences for the following operatively linked
elements listed in a 5' to 3' direction: a first
cassette consisting of an inducible LacZ promoter
upstream from the LacZ gene; a Not I restriction
site; a ribosome binding site; a pelB leader; a
spacer; a cloning region bordered by a 5' Xho I and
3' Spe I restriction site; a decapeptide tag
followed by expression control stop sequences; an
EcoR I restriction site located 5' to a second
cassette consisting of an expression control
ribosome binding site; a pelB leader; a spacer
region; a cloning region bordered by a 5' Sac I and
a 3' Xba I restriction site followed by expression
control stop sequences and a second Not I
restriction site.

A preferred combinatorial vector for use in
this invention, designated pComb2, is constructed
by combining portions of phagemid Hc2 and phagemid
Lc3 as described above for preparing pComb. The
resultant combinatorial vector, pComb2, consists of
a DNA molecule having two cassettes identical to
pComb to express two fusion proteins identically to
pComb except that a second Spe I restriction site
in the second cassette is eliminated.

b. Construction of the pCombIII Vector for
Expressing Fusion Proteins Having a
Bacteriophage Coat Protein Membrane
Anchor

Because of the multiple endonuclease
restriction cloning sites, the pComb phagemid
expression vector prepared above is a useful
cloning vehicle for modification for the
preparation of an expression vector for use in this
invention. To that end, pComb was digested with
EcoR I and Spe I followed by phosphatase treatment
to produce linearized pComb.

1) Preparation of pCombIII

A separate phagemid expression
vector was constructed using sequences encoding
bacteriophage cpIII membrane anchor domain. A PCR
product defining the DNA sequence encoding the
filamentous phage coat protein, cpIII, membrane
anchor containing a LacZ promoter region sequence
3' to the membrane anchor for expression of the
light chain and Spe I and EcoR I cohesive termini
was prepared from M13mp18, a commercially available
bacteriophage vector (Pharmacia, Piscataway, New
Jersey).

To prepare a modified cpIII, replicative form
DNA from M13mp18 was first isolated. Briefly, into
2 ml of LB (Luria-Bertani medium), 50 µl of a
culture of a bacterial strain carrying an F'
episome (JM107, JM109 or TG1) was admixed with a
one tenth suspension of bacteriophage particles
derived from a single plaque. The admixture was
incubated for 4 to 5 hours at 37°C with constant
agitation. The admixture was then centrifuged at
12,000 x g for 5 minutes to pellet the infected
bacteria. After the supernatant was removed, the
pellet was resuspended by vigorous vortexing in 100
µl of ice-cold Solution I. Solution I was prepared
by admixing 50 mM glucose, 10 mM EDTA (disodium
ethylenediaminetetraacetic acid) and 25 mM Tris-HCl
at pH 8.0, and autoclaving for 15 minutes.

To the bacterial suspension, 200 µl of freshly
prepared Solution II was admixed and the tube was
rapidly inverted five times. Solution II was
prepared by admixing 0.2 N NaOH and 1% SDS. To the
bacterial suspension, 150 µl of ice-cold Solution
III was admixed and the tube was vortexed gently in
an inverted position for 10 seconds to disperse
Solution III through the viscous bacterial lysate.
Solution III was prepared by admixing 60 ml of 5 M
potassium acetate, 11.5 ml of glacial acetic acid
and 28.5 ml of water. The resultant bacterial
lysate was then stored on ice for 5 minutes
followed by centrifugation at 12,000 x g for 5
minutes at 4°C in a microfuge. The resultant
supernatant was recovered and transferred to a new
tube. To the supernatant was added an equal volume
of phenol/chloroform and the admixture was
vortexed. The admixture was then centrifuged at
12,000 x g for 2 minutes in a microfuge. The
resultant supernatant was transferred to a new tube
and the double-stranded bacteriophage DNA was
precipitated with 2 volumes of ethanol at room
temperature. After allowing the admixture to stand
at room temperature for 2 minutes, the admixture
was centrifuged to pellet the DNA. The supernatant
was removed and the pelleted replicative form DNA
was resuspended in 25 µl of Tris-HCl at pH 7.6, and
10 mM EDTA (TE).

An alternative Lac-B primer for use in
constructing the cpIII membrane anchor and LacZ
promoter region was Lac-B' as shown in Table 3.
The amplification reactions were performed as
described above with the exception that in the
second PCR amplification, Lac-B' was used with
Lac-F instead of Lac-B. The product from the
amplification reaction is listed in the sequence
listing as SEQ ID NO 41 from nucleotide position 1
to nucleotide position 172. The use of Lac-B'
resulted in a LacZ region lacking 29 nucleotides on
the 3' end but was functionally equivalent to the
longer fragment produced with the Lac-F and Lac-B
primers.

The products of the first and second PCR 
amplifications using the primer pairs G-3(F) and 
G-3(B) and Lac-F and Lac-B were then recombined at 
the nucleotides corresponding to cpIII membrane 
anchor overlap and Nhe I restriction site and 
subjected to a second round of PCR using the G-3(F) 
(SEQ ID NO 35) and Lac-B (SEQ ID NO 38) primer pair 
to form a recombined PCR DNA fragment product 
consisting of the following: a 5' Spe I restriction 
site; a cpIII DNA membrane anchor domain beginning 
at the nucleotide residue sequence which 
corresponds to the amino acid residue 198 of the 
entire mature cpIII protein; an endogenous stop 
site provided by the membrane anchor at amino acid 
residue number 112; a Nhe I restriction site; a
LacZ promoter, operator and Cap-binding site
sequence; and a 3' EcoR I restriction site.
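
The recombination of the two first-round products depends on a shared sequence introduced by the G-3(B) and LAC-F primers. The sketch below is a minimal illustration, assuming the primer sequences as printed in Table 3 (extraction spacing removed) and using hypothetical helper names. It finds the stretch where the 3' end of the first product's top strand (the reverse complement of G-3(B)) matches the 5' end of the second product (LAC-F), and checks that this stretch carries the Nhe I recognition sequence GCTAGC.

```python
# Sketch only: primer sequences from Table 3; helper names are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0

G3_B = "TTACTAGCTAGCATAATAACGGAATACCCAAAAGAACTGG"   # backward primer, cpIII anchor
LAC_F = "TATGCTAGCTAGTAACACGACAGGTTTCCCGACTGG"      # forward primer, LacZ region

# The cpIII product's top strand ends with revcomp(G3_B);
# the LacZ product's top strand begins with LAC_F.
k = overlap(revcomp(G3_B), LAC_F)
shared = LAC_F[:k]
nhe_start = shared.find("GCTAGC")  # 0-based; -1 if the site is absent

print("shared junction sequence:", shared)
print("Nhe I site begins at position", nhe_start + 1)
```

With the sequences as printed, the Nhe I site begins at position 4 of LAC-F, which agrees with note 3 of Table 3.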

To construct a phagemid vector for the
coordinate expression of a heavy chain-cpIII fusion
protein as prepared in Example 2 with kappa light
chain, the recombined PCR modified cpIII membrane
anchor domain DNA fragment was then restriction
digested with Spe I and EcoR I to produce a DNA
fragment for directional ligation into a similarly
digested pComb2 phagemid expression vector having
only one Spe I site prepared in Example 1a4) to
form a pComb2-III (also referred to as pComb2-3)
phagemid expression vector. Thus, the resultant
ampicillin resistance conferring pComb2-3 vector,
having only one Spe I restriction site, contained
separate LacZ promoter/operator sequences for
directing the separate expression of the heavy
chain (Fd)-cpIII fusion product and the light chain
protein. The expressed proteins were directed to
the periplasmic space by pelB leader sequences for
functional assembly on the membrane. Inclusion of
the phage F1 intergenic region in the vector
allowed for packaging of single stranded phagemid
with the aid of helper phage. The use of helper
phage superinfection led to expression of two
forms of cpIII. Thus, normal phage morphogenesis
was perturbed by competition between the Fab-cpIII
fusion and the native cpIII of the helper phage for
incorporation into the virion for Fab-cpVIII
fusions. In addition, also contemplated for use in
this invention are vectors conferring
chloramphenicol resistance and the like.

A more preferred phagemid expression vector
for use in this invention having additional
restriction enzyme cloning sites, designated
pComb-III' or pComb2-3', was prepared as described
above for pComb2-3 with the addition of a 51 base
pair fragment from pBluescript as described by
Short et al., Nuc. Acids Res., 16:7583-7600 (1988)
and commercially available from Stratagene. To
prepare pComb2-3', pComb2-3 was first digested with
Xho I and Spe I restriction enzymes to form a
linearized pComb2-3. The vector pBluescript was
digested with the same enzymes releasing a 51 base
pair fragment containing the restriction enzyme
sites Sal I, Acc I, Hinc II, Cla I, Hind III, EcoR
V, Pst I, Sma I and BamH I. The 51 base pair
fragment was ligated into the linearized pComb2-3
vector via the cohesive Xho I and Spe I termini to
form pComb2-3'.

Table 3

SEQ ID NO   Note   Primer   Sequence
(35)        1      G-3(F)   5' GAGACGACTAGTGGTGGCGGTGGCTCTCCATTCGTTTGTGAATATCAA 3'
(36)        2      G-3(B)   5' TTACTAGCTAGCATAATAACGGAATACCCAAAAGAACTGG 3'
(37)        3      LAC-F    5' TATGCTAGCTAGTAACACGACAGGTTTCCCGACTGG 3'
(38)        4      LAC-B    5' ACCGAGCTCGAATTCGTAATCATGGTC 3'
(39)        5      LAC-B'   5' AGCTGTTGAATTCGTGAAATTGTTATCCGCT 3'

F Forward Primer
B Backward Primer

1 From 5' to 3': Spe I restriction site sequence is
single underlined; the overlapping sequence with
the 5' end of cpIII is double underlined.

2 From 5' to 3': Nhe I restriction site sequence is
single underlined; the overlapping sequence with 3'
end of cpIII is double underlined.

3 From 5' to 3': overlapping sequence with the 3'
end of cpIII is double underlined; Nhe I
restriction sequence begins with the nucleotide
residue "G" at position 4 and extends 5 more
residues = GCTAGC.

4 EcoR I restriction site sequence is single
underlined.

5 Alternative backwards primer for amplifying LacZ;
EcoR I restriction site sequence is single
underlined.



2. Isolation of HIV-1-Specific Monoclonal
Antibodies Produced from the Dicistronic
Expression Vector, pComb2-3

In practicing this invention, the heavy (Fd
consisting of VH and CH1) and light (kappa) chains
(VL, CL) of antibodies are first targeted to the
periplasm of E. coli for the assembly of
heterodimeric Fab molecules. In order to obtain
expression of antibody Fab libraries on a phage
surface, the nucleotide residue sequences encoding
either the Fd or light chains must be operatively
linked to the nucleotide residue sequence encoding
a filamentous bacteriophage coat protein membrane
anchor. A coat protein for use in this invention
in providing a membrane anchor is coat protein III
(cpIII or cp3). In the Examples described herein, methods
for operatively linking a nucleotide residue
sequence encoding a Fd chain to a cpIII membrane
anchor in a fusion protein of this invention are
described.

In a phagemid vector, a first and second 
cistron consisting of translatable DNA sequences 
are operatively linked to form a dicistronic DNA 
molecule. Each cistron in the dicistronic DNA 
molecule is linked to DNA expression control 
sequences for the coordinate expression of a fusion 
protein, Fd-cpIII, and a kappa light chain. 

The first cistron encodes a periplasmic 
secretion signal (pelB leader) operatively linked 
to the fusion protein, Fd-cpIII. The second 
cistron encodes a second pelB leader operatively 
linked to a kappa light chain. The presence of the 
pelB leader facilitates the coordinated but 
separate secretion of both the fusion protein and 
light chain from the bacterial cytoplasm into the 
periplasmic space. 

In this process, the phagemid expression
vector carries an ampicillin selectable resistance
marker gene (beta lactamase or bla) in addition to
the Fd-cpIII fusion and the kappa chain. The f1
phage origin of replication facilitates the
generation of single stranded phagemid. The
isopropyl thiogalactopyranoside (IPTG) induced
expression of a dicistronic message encoding the
Fd-cpIII fusion (VH, CH1, cpIII) and the light chain
(VL, CL) leads to the formation of heavy and light
chains. Each chain is delivered to the periplasmic
space by the pelB leader sequence, which is
subsequently cleaved. The heavy chain is anchored
in the membrane by the cpIII membrane anchor domain
while the light chain is secreted into the
periplasm. The heavy chain in the presence of
light chain assembles to form Fab molecules. This
same result can be achieved if, in the alternative,
the light chain is anchored in the membrane via a
light chain fusion protein having a membrane anchor
and heavy chain is secreted via a pelB leader into
the periplasm.

With subsequent infection of E. coli with a 
helper phage, as the assembly of the filamentous 
bacteriophage progresses, the coat protein III is 
incorporated on the tail of the bacteriophage. 

a. Preparation of Lymphocyte RNA 

Five milliliters of bone marrow was
removed by aspiration from HIV-1 asymptomatic
seropositive individuals. Total cellular RNA was
prepared from the bone marrow lymphocytes as
described above using the RNA preparation methods
described by Chomczynski et al., Anal. Biochem.,
162:156-159 (1987) and using the RNA isolation kit
(Stratagene) according to the manufacturer's
instructions. Briefly, for immediate
homogenization of the cells in the isolated bone
marrow, 10 ml of a denaturing solution containing
3.0 M guanidinium isothiocyanate containing 71 µl
of beta-mercaptoethanol was admixed to the isolated
bone marrow. One ml of sodium acetate at a
concentration of 2 M at pH 4.0 was then admixed
with the homogenized cells. One ml of phenol that
had been previously saturated with H2O was also
admixed to the denaturing solution containing the
homogenized spleen. Two ml of a chloroform:isoamyl
alcohol (24:1 v/v) mixture was added to this
homogenate. The homogenate was mixed vigorously
for ten seconds and maintained on ice for 15
minutes. The homogenate was then transferred to a
thick-walled 50 ml polypropylene centrifuge tube
(Fisher Scientific Company, Pittsburgh, PA). The
solution was centrifuged at 10,000 x g for 20
minutes at 4°C. The upper RNA-containing aqueous
layer was transferred to a fresh 50 ml
polypropylene centrifuge tube and mixed with an
equal volume of isopropyl alcohol. This solution
was maintained at -20°C for at least one hour to
precipitate the RNA. The solution containing the
precipitated RNA was centrifuged at 10,000 x g for
twenty minutes at 4°C. The pelleted total cellular
RNA was collected and dissolved in 3 ml of the
denaturing solution described above. Three ml of
isopropyl alcohol was added to the resuspended
total cellular RNA and vigorously mixed. This
solution was maintained at -20°C for at least 1
hour to precipitate the RNA. The solution
containing the precipitated RNA was centrifuged at
10,000 x g for ten minutes at 4°C. The pelleted
RNA was washed once with a solution containing 75%
ethanol. The pelleted RNA was dried under vacuum
for 15 minutes and then resuspended in diethyl
pyrocarbonate-treated H2O (DEPC-H2O).

Messenger RNA (mRNA) enriched for sequences
containing long poly A tracts was prepared from the
total cellular RNA using methods described in
Molecular Cloning: A Laboratory Manual, Maniatis et
al., eds., Cold Spring Harbor, NY, (1982).
Briefly, one half of the total RNA isolated from a
single donor prepared as described above was
resuspended in one ml of DEPC-H2O and maintained at
65°C for five minutes. One ml of 2X high salt
loading buffer consisting of 100 mM Tris-HCl, 1 M
NaCl, 2.0 mM EDTA at pH 7.5, and 0.2% SDS was
admixed to the resuspended RNA and the mixture
allowed to cool to room temperature.
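
The effect of mixing equal volumes of RNA solution and 2X loading buffer can be worked through as a simple dilution. The sketch below is illustrative only and assumes that the RNA solution (RNA in DEPC-H2O) contributes none of the buffer components; the function name is hypothetical. Under that assumption, one ml of 2X buffer plus one ml of sample gives the 1X composition, that is, 50 mM Tris-HCl, 500 mM NaCl, 1 mM EDTA and 0.1% SDS.

```python
# Sketch only: simple dilution arithmetic, assuming the sample adds no solutes.
two_x_buffer = {
    "Tris-HCl (mM)": 100.0,
    "NaCl (mM)": 1000.0,   # 1 M
    "EDTA (mM)": 2.0,
    "SDS (%)": 0.2,
}

def dilute(buffer: dict, buffer_ml: float, sample_ml: float) -> dict:
    """Final concentrations after mixing buffer with a solute-free sample."""
    factor = buffer_ml / (buffer_ml + sample_ml)
    return {component: conc * factor for component, conc in buffer.items()}

final = dilute(two_x_buffer, buffer_ml=1.0, sample_ml=1.0)
for component, conc in final.items():
    print(f"{component:14s} {conc:g}")
```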

The total purified mRNA was then used in PCR
amplification reactions as described in Example 2c.
Alternatively, the mRNA was further purified to
poly A+ RNA by the following procedure. The total
mRNA was applied to an oligo-dT (Collaborative
Research Type 2 or Type 3) column that was
previously prepared by washing the oligo-dT with a
solution containing 0.1 M sodium hydroxide and 5 mM
EDTA and then equilibrating the column with
DEPC-H2O. The eluate was collected in a sterile
polypropylene tube and reapplied to the same column
after heating the eluate for 5 minutes at 65°C.
The oligo-dT column was then washed with 2 ml of
high salt loading buffer consisting of 50 mM
Tris-HCl at pH 7.5, 500 mM sodium chloride, 1 mM
EDTA at pH 7.5 and 0.1% SDS. The oligo-dT column
was then washed with 2 ml of 1X medium salt buffer
consisting of 50 mM Tris-HCl, pH 7.5, 100 mM sodium
chloride, 1 mM EDTA and 0.1% SDS. The messenger RNA
was eluted from the oligo-dT column with 1 ml of buffer
consisting of 10 mM Tris-HCl at pH 7.5, 1 mM EDTA
at pH 7.5, and 0.05% SDS. The messenger RNA was
purified by extracting this solution with
phenol/chloroform followed by a single extraction
with 100% chloroform. The messenger RNA was
concentrated by ethanol precipitation and
resuspended in DEPC-H2O.

The resultant purified mRNA contained a
plurality of anti-HIV encoding VH and VL sequences
for preparation of an anti-HIV-1 Fab DNA library.

b. Construction of a Combinatorial HIV-1 
Antibody Library 

1) Selection of Oligonucleotide Primers 
The nucleotide sequences encoding 
the immunoglobulin protein CDR's are highly 
variable. However, there are several regions of 
conserved sequences that flank the V region domains 
of either the light or heavy chain, for instance, 
and that contain substantially conserved nucleotide 
sequences, i.e., sequences that will hybridize to 
the same primer sequence. Therefore, 
polynucleotide synthesis (amplification) primers 
that hybridize to the conserved sequences and 
incorporate restriction sites into the DNA homolog 
produced that are suitable for operatively linking 
the synthesized DNA fragments to a vector were 
constructed. More specifically, the primers were 
designed so that the resulting DNA homologs 
produced can be inserted into an expression vector 
of this invention in reading frame with the 
upstream translatable DNA sequence at the region of
the vector containing the directional ligation
means.

For amplification of the VH domains, primers
were designed to introduce cohesive termini
compatible with directional ligation into the
unique Xho I and Spe I sites of the pComb2-3
expression vector. In all cases, the 5' primers
VH1a (5' CAGGTGCAGCTCGAGCAGTCTGGG 3' SEQ ID NO 42)
and VH3a (5' GAGGTGCAGCTCGAGGAGTCTGGG 3' SEQ ID NO
43) were designed to maximize homology with the VH1
and VH3 subgroup families, respectively, although
considerable cross-priming of other subgroups was
expected. The Xho I restriction site for cloning
into the pComb2-3 vector is underlined. The 3'
primer CG1z having the nucleotide sequence 5'
GCATGTACTAGTTTTGTCACAAGATTTGGG 3' (SEQ ID NO 44)
used in conjunction with the 5' primers is the
primer for the heavy chain corresponding to part of
the hinge region. The Spe I site for cloning into
the pComb2-3 vector is underlined.

The nucleotide sequences encoding the VL domain
are highly variable. However, there are several
regions of conserved sequences that flank the VL
domains, including the JL, VL framework regions and
VL leader/promoter. Therefore, amplification
primers were constructed that hybridize to the
conserved sequences and incorporate restriction
sites that allow cloning of the amplified fragments
into the pComb2-3 expression vector cut with Sac I
and Xba I.

For amplification of the kappa VL domains,
analogous to the heavy chain primers listed above,
the 5' primers VK1a (5' GACATCGAGCTCACCCAGTCTCCA
3', SEQ ID NO 45) and VK3a (5'
GAAATTGAGCTCACGCAGTCTCCA 3', SEQ ID NO 46) were
used. These primers also introduced a Sac I
restriction endonuclease site (GAGCTC), indicated by
the underlined nucleotides, to allow the VL DNA homolog
to be cloned into the pComb2-3 expression vector.
The 3' VL amplification primer CK1a, having the
nucleotide sequence 5'
GCGCCGTCTAGAACTAACACTCTCCCCTGTTGAAGCTCTTTGTGACGGGCAAG
3' (SEQ ID NO 47) and corresponding to the 3' end of
the light chain, was used to amplify the light chain
while incorporating the underlined Xba I
restriction endonuclease site (TCTAGA) required to
insert the VL DNA homolog into the pComb2-3 expression
vector.
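
By way of illustration only, the following Python sketch checks that
each primer listed above carries the recognition sequence of the
restriction enzyme named for it. The recognition sequences used
(CTCGAG for Xho I, ACTAGT for Spe I, GAGCTC for Sac I and TCTAGA for
Xba I) are the standard ones for these enzymes; the dictionary and
function names are hypothetical and form no part of the described
method.

```python
# Primer sequences as given for SEQ ID NOs 42-47 (5' to 3').
PRIMERS = {
    "VH1a": "CAGGTGCAGCTCGAGCAGTCTGGG",
    "VH3a": "GAGGTGCAGCTCGAGGAGTCTGGG",
    "CG1z": "GCATGTACTAGTTTTGTCACAAGATTTGGG",
    "VK1a": "GACATCGAGCTCACCCAGTCTCCA",
    "VK3a": "GAAATTGAGCTCACGCAGTCTCCA",
    "CK1a": "GCGCCGTCTAGAACTAACACTCTCCCCTGTTGAAGCTCTTTGTGACGGGCAAG",
}

# Standard recognition sequences of the four cloning enzymes.
SITES = {
    "Xho I": "CTCGAG",
    "Spe I": "ACTAGT",
    "Sac I": "GAGCTC",
    "Xba I": "TCTAGA",
}

# Enzyme site each primer is stated to introduce.
EXPECTED_SITE = {
    "VH1a": "Xho I",
    "VH3a": "Xho I",
    "CG1z": "Spe I",
    "VK1a": "Sac I",
    "VK3a": "Sac I",
    "CK1a": "Xba I",
}


def site_offsets():
    """Return the 0-based offset of the expected site in each primer.

    An offset of -1 means the site was not found.
    """
    return {
        name: seq.find(SITES[EXPECTED_SITE[name]])
        for name, seq in PRIMERS.items()
    }


if __name__ == "__main__":
    for name, offset in site_offsets().items():
        print(name, EXPECTED_SITE[name], offset)
```

In each primer the site sits a few bases in from the 5' end, so a
short overhang remains on either side of the cut.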

All primers and synthetic polynucleotides
described herein were either purchased from
Research Genetics in Huntsville, Alabama or
synthesized on an Applied Biosystems DNA
synthesizer, model 381A, using the manufacturer's
instructions.

2) PCR Amplification of VH and VL DNA
Homologs

In preparation for PCR
amplification, mRNA prepared above was used as a
template for cDNA synthesis by a primer extension
reaction. First, 20-50 µg of total mRNA in water
was hybridized (annealed) at 70°C for 10
minutes with 600 ng (60.0 pmol) of either the heavy
or light chain 3' primers listed above.
Subsequently, the hybridized admixture was used in
a typical 50 µl reverse transcription reaction
containing 200 µM each of dATP, dCTP, dGTP and
dTTP, 40 mM Tris-HCl at pH 8.0, 8 mM MgCl2, 50 mM
NaCl, 2 mM spermidine and 600 units of reverse
transcriptase (Superscript, BRL). The reaction
admixture was then maintained for one hour at 37°C
to form an RNA-cDNA admixture.

Three µl of the resultant RNA-cDNA admixture
was then used in PCR amplification in a reaction
volume of 100 µl containing a mixture of all four
dNTPs at a concentration of 60 µM, 50 mM KCl, 10 mM
Tris-HCl at pH 8.3, 15 mM MgCl2, 0.1% gelatin, 5
units of Thermus aquaticus (Taq) DNA polymerase
(Perkin-Elmer-Cetus, Emeryville, California), and
60 pmol of the appropriate 5' and 3' primers listed
above. The separate reaction admixtures were
overlaid with mineral oil and subjected to 35
cycles of amplification. Each amplification cycle
included denaturation at 91°C for 1 minute,
annealing at 52°C for 2 minutes and polynucleotide
synthesis by primer extension (elongation) at 72°C
for 1.5 minutes, followed by a final maintenance
period of 10 minutes at 72°C. An aliquot of each of
the reaction admixtures was then separately
electrophoresed on a 2% agarose gel. After
successful amplification as determined by gel
electrophoretic migration, the remainder of the
RNA-cDNA was amplified, after which the PCR products
of a common 3' primer were pooled into separate
VH- and VL-coding DNA homolog-containing samples,
extracted twice with phenol/chloroform and
once with chloroform, ethanol precipitated and
stored at -70°C in 10 mM Tris-HCl at pH 7.5 and 1
mM EDTA.
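
As a rough planning aid, the cycling program described above can be
tallied as in the following Python sketch. It assumes that the final
10 minute maintenance period at 72°C is applied once, after the last
of the 35 cycles, and it ignores the time the instrument spends
ramping between temperatures; both are assumptions, and the names
used are hypothetical.

```python
# Steps of one amplification cycle: (label, temperature in °C, minutes).
CYCLE_STEPS = [
    ("denaturation", 91, 1.0),
    ("annealing", 52, 2.0),
    ("elongation", 72, 1.5),
]
NUM_CYCLES = 35
FINAL_MAINTENANCE_MIN = 10.0  # assumed to occur once, after all cycles


def total_hold_minutes(steps=CYCLE_STEPS, cycles=NUM_CYCLES,
                       final_hold=FINAL_MAINTENANCE_MIN):
    """Sum the programmed hold times, excluding temperature ramps."""
    per_cycle = sum(minutes for _, _, minutes in steps)
    return cycles * per_cycle + final_hold


if __name__ == "__main__":
    print(total_hold_minutes())  # programmed hold time in minutes
```

Under these assumptions the programmed hold time is 167.5 minutes,
that is, a little under three hours before ramping is counted.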

3) Insertion of VH- and VL-Coding DNA
Homologs into pComb2-3 Expression
Vector

The VH-coding DNA homologs (heavy
chain) prepared above were then digested with an
excess of Xho I and Spe I for subsequent ligation
into a similarly digested and linearized pComb2-3
in a total volume of 150 µl with 10 units of ligase
at 16°C overnight. The construction of the library
was performed as described by Burton et al., Proc.
Natl. Acad. Sci., USA, 88:10134-10137 (1991).
Briefly, following ligation, the pComb2-3 vector
containing heavy chain DNA was then transformed by
electroporation into 300 µl of XL1-Blue cells.
After transformation and culturing, library size
was determined by plating aliquots of the culture.
Typically the library had about 10^7 members. An
overnight culture was then prepared from which
phagemid DNA containing the heavy chain library was
prepared.

For the cloning of the VL-coding DNA homologs
(light chain), 10 µg of phagemid DNA containing the
heavy chain library was then digested with Sac I
and Xba I. The resulting linearized vector was
treated with phosphatase and purified by agarose
gel electrophoresis. The desired fragment, 4.7 kb
in length, was excised from the gel. Ligation of
this vector with prepared light chain PCR DNA
proceeded as described above for heavy chain. A
library of approximately 10^7 members having heavy
chain fragments operatively linked to the cpIII
anchor sequence (Fd-cpIII) and light chain
fragments was thus produced.

4) Preparation of Phage Expressing Fab
Heterodimers

Following transformation of the
resultant library produced above into XL1-Blue
cells, phage were prepared to allow for isolation
of HIV-1 specific Fabs by panning on target
antigens. To isolate phage on which heterodimer
expression has been induced, 3 ml of SOC medium
(SOC was prepared by admixture of 20 g
bacto-tryptone, 5 g yeast extract and 0.5 g NaCl in
one liter of water, adjusting the pH to 7.5 and
admixing 20 ml of glucose just before use to induce
the expression of the Fd-cpIII and light chain
heterodimer) was admixed and the culture was shaken
at 220 rpm for one hour at 37°C, after which time
10 ml of SB (SB was prepared by admixing 30 g
tryptone, 20 g yeast extract, and 10 g Mops buffer
per liter with pH adjusted to 7) containing 20
µg/ml carbenicillin and 10 µg/ml tetracycline was
admixed and the admixture was shaken at 300 rpm for an
additional hour. This resultant admixture was
admixed to 100 ml SB containing 50 µg/ml
carbenicillin and 10 µg/ml tetracycline and shaken
for one hour, after which time helper phage VCSM13
(10^12 pfu) were admixed and the admixture was shaken
for an additional two hours. After this time, 70
µg/ml kanamycin was admixed and the culture was
maintained at 30°C overnight. The lower temperature
resulted in better heterodimer incorporation on the
surface of the phage. The supernatant was cleared by
centrifugation (4000 rpm for 15 minutes in a JA10
rotor at 4°C). Phage were precipitated by
admixture of 4% (w/v) polyethylene glycol 8000 and
3% (w/v) NaCl and maintained on ice for 30 minutes,
followed by centrifugation (9000 rpm for 20 minutes
in a JA10 rotor at 4°C). Phage pellets were
resuspended in 2 ml of PBS and microcentrifuged for
three minutes to pellet debris, transferred to
fresh tubes and stored at -20°C for subsequent
screening as described below.

To determine the titer in colony forming
units (cfu), phage (packaged phagemid) were diluted
in SB and 1 µl was used to infect 50 µl of fresh
(OD600 = 1) XL1-Blue cells grown in SB containing
10 µg/ml tetracycline. Phage and cells were
maintained at room temperature for 15 minutes and
then directly plated on LB/carbenicillin plates.
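
The arithmetic behind such a titration may be sketched as follows. The
relation used, namely colonies counted multiplied by the dilution
factor and divided by the volume of diluted phage used for infection,
is the conventional one and is an assumption here, since the text does
not spell it out. The colony count and dilution in the example are
hypothetical.

```python
def titer_cfu_per_ml(colonies, dilution_factor, volume_ul):
    """Estimate the titer of the undiluted phage stock in cfu/ml.

    colonies        -- colonies counted on the selective plate
    dilution_factor -- fold dilution of the stock in SB (1e6 for 1:10^6)
    volume_ul       -- volume of diluted phage used to infect cells, in µl
    """
    if volume_ul <= 0:
        raise ValueError("volume_ul must be positive")
    volume_ml = volume_ul / 1000.0
    return colonies * dilution_factor / volume_ml


if __name__ == "__main__":
    # Hypothetical plate: 46 colonies from 1 µl of a 10^6-fold dilution.
    print(f"{titer_cfu_per_ml(46, 1e6, 1.0):.2e} cfu/ml")
```

Counting plates from more than one dilution and averaging the
resulting estimates would reduce the effect of plating variation.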

5) Selection of Anti-HIV-1 Heterodimers
on Phage Surfaces

(a) Multiple Pannings of the Phage
Library

The phage library produced in
Example 2b4) was panned against recombinant gp120
of HIV-1 strain IIIB as described herein on coated
microtiter plates to select for anti-gp120
heterodimers. A second phage library was panned
against recombinant gp41 (American Biotechnologies,
Boston, MA) as described below to select for
anti-gp41 heterodimers.

The panning procedure used was a modification
of that originally described by Parmley and Smith
(Parmley et al., Gene, 73:305-318 (1988)). Four
rounds of panning were performed to enrich for
specific antigen-binding clones. For this
procedure, four wells of a microtiter plate (Costar
3690) were coated overnight at 4°C with 25 µl of 40
µg/ml gp120 or gp41 (American Biotechnologies)
prepared above in 0.1 M bicarbonate, pH 8.6. The
wells were washed twice with water and blocked by
completely filling the well with 3% (w/v) BSA in
PBS and maintaining the plate at 37°C for one hour.
After the blocking solution was shaken out, 50 µl
of the phage library prepared above (typically 10^11
cfu) were admixed to each well, and the plate was
maintained for two hours at 37°C.

Phage were removed and the plate was washed
once with water. Each well was then washed ten
times with TBS/Tween (50 mM Tris-HCl at pH 7.5, 150
mM NaCl, 0.5% Tween 20) over a period of one hour
at room temperature, where the washing consisted of
pipetting up and down to wash the well, each time
allowing the well to remain completely filled with
TBS/Tween between washings. The plate was washed
once more with distilled water and adherent phage
were eluted by the addition of 50 µl of elution
buffer (0.1 M HCl, adjusted to pH 2.2 with solid
glycine, containing 1 mg/ml BSA) to each well
followed by maintenance at room temperature for 10
minutes. The elution buffer was pipetted up and
down several times, removed, and neutralized with 3
µl of 2 M Tris base per 50 µl of elution buffer
used.

Eluted phage were used to infect 2 ml of fresh
(OD600 = 1) E. coli XL1-Blue cells for 15 minutes at
room temperature, after which time 10 ml of SB
containing 20 µg/ml carbenicillin and 10 µg/ml
tetracycline was admixed. Aliquots of 20, 10, and
1/10 µl were removed from the culture for plating
to determine the number of phage (packaged
phagemids) that were eluted from the plate. The
culture was shaken for one hour at 37°C, after
which it was added to 100 ml of SB containing 50
µg/ml carbenicillin and 10 µg/ml tetracycline and
shaken for one hour. Helper phage VCSM13 (10^12 pfu)
were then added and the culture was shaken for an
additional two hours. After this time, 70 µg/ml
kanamycin was added and the culture was incubated
at 37°C overnight. Phage preparation and further
panning were repeated as described above.

Following each round of panning, the
percentage yield of phage was determined, where %
yield = (number of phage eluted/number of phage
applied) x 100. The initial phage input ratio was
determined by titering on selective plates to be
approximately 10^11 cfu for each round of panning.
The final phage output ratio was determined by
infecting two ml of logarithmic phase XL1-Blue
cells as described above and plating aliquots on
selective plates. In the first panning for
gp120-reactive phage, 4.6 x 10^11 phage were applied to
four wells and 7.7 x 10^5 phage were eluted. After
the fourth panning 1.0 x 10^8 phage were eluted.
From this procedure, 20 clones were selected from
the Fab library for their ability to bind to
glycosylated recombinant gp120 from the IIIB strain
of HIV-1. Five clones were selected from the Fab
library specific for binding to gp41. The panned
phage surface libraries were then converted into
ones expressing soluble Fab fragments for further
screening by ELISA as described below.
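
The percentage yield defined above, applied to the phage numbers
reported for the gp120 panning, can be worked through as in the
following Python sketch. The function names are hypothetical. The fold
increase is computed from the eluted phage counts alone, since the
number of phage applied in the fourth round is given only as
approximately 10^11 cfu.

```python
def percent_yield(eluted, applied):
    """% yield = (number of phage eluted / number of phage applied) x 100."""
    if applied <= 0:
        raise ValueError("applied must be positive")
    return eluted / applied * 100.0


def fold_increase(eluted_late, eluted_early):
    """Ratio of phage eluted in a later round to an earlier round."""
    return eluted_late / eluted_early


# Values reported in the text for the gp120 panning.
APPLIED_ROUND_1 = 4.6e11
ELUTED_ROUND_1 = 7.7e5
ELUTED_ROUND_4 = 1.0e8

if __name__ == "__main__":
    print(f"round 1 yield: {percent_yield(ELUTED_ROUND_1, APPLIED_ROUND_1):.2e} %")
    print(f"eluted phage, round 4 vs round 1: "
          f"{fold_increase(ELUTED_ROUND_4, ELUTED_ROUND_1):.0f}-fold")
```

On these figures the first round yield is roughly 1.7 x 10^-4 percent,
and the number of eluted phage rises about 130-fold by the fourth
round.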

In addition to panning on gp120 of strain IIIB
and gp41, also contemplated as antigens for panning
of combinatorial libraries are recombinant gp120
(IIIB strain) produced in baculovirus and
recombinant gp120 (SF2 strain) produced in Chinese
Hamster Ovary cells obtained as described by
Steimer et al., Science, 254:105-108 (1991).
Another antigen, a synthetic cyclic peptide,
N=CH-(CH2)3CO[SISGPGRAFYTG]NCH2CO-Cys-NH2 (SEQ ID NO 48),
prepared as described by Satterthwait et al.,
Bulletin of the World Health Organization, 68:
Suppl., 17-25 (1990), corresponding to the central
most conserved part of the V3 loop of gp120, was
coupled to maleimide-activated BSA. The library
was panned using 1, 2 or 4 ELISA wells coated with
1 µg of protein antigen or 10 µg of BSA-peptide per
well. Four rounds of panning were carried out for
each antigen as described above. Eluted phage from
the final round were used to infect XL1-Blue cells.
Four rounds of panning against the four antigens
produced an amplification in eluted phage of
between 100 and 1000 fold. The panned phage
surface libraries were then converted into ones
expressing soluble Fab fragments for further
screening by ELISA as described below.

6) Preparation of Soluble Heterodimers
and Characterization of Binding
Specificity to HIV-1 Antigens
In order to further characterize the
specificity of the mutagenized heterodimers
expressed on the surface of phage as described
above, soluble Fab heterodimers from acid eluted
phage were prepared and analyzed in ELISA assays on
HIV-1 derived antigen-coated plates and by
competitive ELISA.

To prepare soluble heterodimers, phagemid DNA
from the 20 gp120 positive clones and the 5 gp41
positive clones prepared above was isolated and
digested with Spe I and Nhe I. Digestion with
these enzymes produced compatible cohesive ends.
The 4.7 kb DNA fragment lacking the gene III
portion was gel-purified (0.6% agarose) and
self-ligated. Transformation of E. coli XL1-Blue
afforded the isolation of recombinants lacking the
cpIII fragment. Clones were examined for removal
of the cpIII fragment by Xho I - Xba I digestion,
which should yield a 1.6-kb fragment. Clones were
grown in 100 ml SB containing 50 µg/ml
carbenicillin and 20 mM MgCl2 at 37°C until an OD600
of 0.2 was achieved. IPTG (1 mM) was added and the
culture grown overnight at 30°C (growth at 37°C
provides only a slight reduction in heterodimer
yield). Cells were pelleted by centrifugation at
4000 rpm for 15 minutes in a JA10 rotor at 4°C.
Cells were resuspended in 4 ml PBS containing 34
µg/ml phenylmethylsulfonyl fluoride (PMSF) and
lysed by sonication on ice (2-4 minutes at 50%
duty). Debris was pelleted by centrifugation at
14,000 rpm in a JA20 rotor at 4°C for 15 minutes.
The supernatant was used directly for ELISA
analysis as described below and was stored at
-20°C. For the study of a large number of clones,
10 ml cultures provided sufficient heterodimer for
analysis. In this case, sonications were performed
in 2 ml of buffer.

Assays as described above were also performed
for the gp41-specific clones.

(a) Screening by ELISA

The soluble heterodimers
prepared above were assayed by ELISA. For this
assay, gp120 and gp41 were separately admixed to
individual wells of a microtiter plate as described
above for the panning procedure and maintained at
4°C overnight to allow the protein solution to
adhere to the walls of the well. After the
maintenance period, the wells were washed five
times with water and thereafter maintained for one
hour at 37°C with 100 µl of a solution of 1% BSA
diluted in PBS to block nonspecific sites on the wells.
Afterwards, the plates were inverted and shaken to
remove the BSA solution. Twenty-five µl of soluble
heterodimers prepared above reactive with the
specific glycoprotein substrate were then admixed
to each well and maintained at 37°C for one hour to
form immunoreaction products. Following the
maintenance period, the wells were washed ten times
with water to remove unbound soluble antibody and
then maintained with 25 µl of a 1:1000 dilution
of secondary goat anti-human IgG F(ab')2 conjugated
to alkaline phosphatase diluted in PBS containing
1% BSA. The wells were maintained at 37°C for one
hour, after which the wells were washed ten times
with water followed by development with 50 µl of
p-nitrophenyl phosphate (PNPP). Color development
was monitored at 405 nm. Positive clones gave A405
values of >1 (mostly >1.5) after 10 minutes,
whereas negative clones gave values of 0.1 to 0.2.

Approximate concentrations of gp120-reactive
Fab were determined using a sandwich ELISA
as described by Zebedee et al., Proc. Natl. Acad.
Sci., USA, 89:3175-3179 (1992) and are presented in
the first column of Figure 6. In addition, since
Fabs are expressed in E. coli and the fraction of
correctly assembled protein can vary, the amount of
Fab reacting with gp120 was also assessed by ELISA
titration. Those data are also presented in Figure 6
in the second column.

For the clones panned against the HIV-1
derived antigens, after conversion of the panned
phage surface libraries to ones expressing soluble
Fab fragments, 30-40 colonies were used to
transform XL1-Blue cells and the supernates were
screened in ELISA assays against the antigen used
in panning. Generally greater than 80% of the
supernates tested positive. A representative
number of positives was then selected from each
antigen panning for further analysis.

(b) Competitive ELISA with Soluble
gp120 and CD4

Immunoreactive heterodimers as
determined in the above ELISA were then analyzed by
competition ELISA to determine the affinity of the
selected heterodimers. The ELISA was performed as
described above on microtiter wells separately
coated with 5 µg/ml of gp120 or soluble CD4
(American Biotechnologies) in 0.1 M bicarbonate
buffer at pH 8.6. Increasing concentrations of
soluble or free gp120 ranging in concentration from
10^-11 M up to 10^-7 M diluted in 0.5% BSA/0.025% Tween
20/PBS were admixed with soluble heterodimers, the
dilutions of which were determined in titration
experiments that resulted in substantial reduction
of OD values after a 2-fold dilution. For the CD4
competition assays, increasing concentrations of
soluble or free CD4 ranging in concentration from
10^-11 M up to 10^-6 M diluted in 0.5% BSA/0.025% Tween
20/PBS were admixed with soluble heterodimers. The
plates were maintained for 90-120 minutes at 37°C
and carefully washed ten times with 0.05% Tween
20/PBS before admixture of alkaline
phosphatase-labelled goat anti-human IgG F(ab')2 at
a dilution of 1:500 followed by maintenance for 1
hour at 37°C. Development was performed as
described for ELISA.

To establish whether neutralizing ability, as
described in Example 3 below, could be related to
the antigen binding affinity of HIV-1-specific
Fabs, competition ELISAs were carried out in which
soluble gp120 competed with gp120 coated on ELISA
plates for Fab binding. Figure 7 shows that all
Fabs were competed from binding to gp120 with an
IC50 of approximately 10^-9 M free gp120. In
addition, as shown in Example 3, there is no
correlation between antigen affinity and
neutralization. The Fabs tested included Fabs
4, 12, 21 and 7, which are members of the same groups
as determined by sequence analysis and comparison
as described in Example 9. Fabs 13, 27, 6, 29, 2
and 3 are all members of different groups as
determined by sequence analysis and comparison as
described in Example 9. Loop 2 is an Fab fragment
selected from the same library as the other Fabs
but which recognizes the V3 loop. Only with the V3
loop peptide was competition carried out with gp120
from the SF2 strain.

To investigate whether neutralization could be
associated with blocking of the gp120-CD4
interaction, competition ELISAs were carried out
with soluble CD4 competing with Fabs for binding to
gp120-coated ELISA wells. The results are shown in
Figure 8. P4D10 and loop 2 are controls not
expected to be competed by CD4. P4D10 is a mouse
monoclonal antibody reacting with the V3 loop of
gp120 (IIIB). Loop 2 Fab competition was carried
out using gp120 (SF2). As shown in Figure 8, the
binding of all Fabs with the exception of the
controls was inhibited with an IC50 of approximately
10^-8 M of soluble CD4. In addition, no difference
was detected between the neutralizing and
non-neutralizing Fabs to gp120 inhibited by CD4.
This implies that blocking of the CD4-gp120
interaction is unlikely to be an important factor
in Fab neutralization of the HIV-1 virus.

Similar competition assays were performed with
the Fabs panned against the four HIV-1 derived
antigens. The 19 Fabs derived from panning against
gp120 (IIIB) showed apparent affinities
(1/concentration at 50% inhibition) for gp120
(IIIB) in the range 10^-7 to 10^-9 M with most being
1-3 x 10^-8 M. The panning procedure tends to select
strongly for tight binders so a grouping into a
relatively narrow band of affinities was expected.
Of 16 Fabs derived from panning against gp160
(IIIB), 6 were also reactive with gp120 (IIIB) and
competition ELISAs showed they had similar apparent
affinities to the gp120-panned Fabs. The non-gp120
reactive clones from the gp160 panning showed a
lower ELISA reactivity with gp160 and could not be
satisfactorily competed with gp160. They may be
directed against gp41 but were not pursued here.
Eight Fabs derived from panning against gp120 (SF2)
also showed strong ELISA reactivity with gp120
(IIIB) and gave similar apparent binding
affinities. Four Fabs were derived from panning
against the V3 loop peptide. Of these Fabs, 2
reacted in ELISA with gp120 (SF2) but none with
gp120 (IIIB). The apparent binding affinity of
these loop binders to gp120 (SF2) was 10^-8 M.
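
One simple way to read the concentration at 50% inhibition from a
competition titration is to interpolate between the two competitor
concentrations that bracket half-maximal signal, working on a
logarithmic concentration axis. The Python sketch below does this and
takes the reciprocal as the apparent affinity, as defined above. The
interpolation scheme is an assumption chosen for illustration, not a
statement of how the reported values were derived, and the titration
data are hypothetical.

```python
import math


def ic50_log_interp(concs_molar, fractions_bound):
    """Concentration giving 50% of uninhibited binding.

    concs_molar     -- competitor concentrations in M, ascending
    fractions_bound -- signal relative to the uninhibited well (1.0 = none)
    Interpolates linearly in log10(concentration) between the two points
    that bracket a bound fraction of 0.5.
    """
    points = list(zip(concs_molar, fractions_bound))
    for (c_lo, f_lo), (c_hi, f_hi) in zip(points, points[1:]):
        if f_lo >= 0.5 >= f_hi and f_lo != f_hi:
            frac = (f_lo - 0.5) / (f_lo - f_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    raise ValueError("bound fraction never crosses 0.5")


def apparent_affinity(ic50_molar):
    """Apparent affinity as 1/(concentration at 50% inhibition), in M^-1."""
    return 1.0 / ic50_molar


# Hypothetical titration of free gp120 from 10^-11 M to 10^-7 M.
CONCS = [1e-11, 1e-10, 1e-9, 1e-8, 1e-7]
BOUND = [0.98, 0.90, 0.70, 0.30, 0.05]

if __name__ == "__main__":
    ic50 = ic50_log_interp(CONCS, BOUND)
    print(f"IC50 = {ic50:.2e} M, apparent affinity = {apparent_affinity(ic50):.2e} M^-1")
```

A fitted sigmoidal curve would use all of the points rather than two,
at the cost of assuming a particular binding model.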

To complete the survey in terms of strain
cross-reactivity of Fabs, those derived from the
gp120 and gp160 (IIIB) pannings were examined for
ELISA reactivity with gp120 (SF2). All were
reactive. Therefore, all the Fabs examined, with
the exception of those selected by panning against
the V3 loop peptide, bound to gp120 from IIIB and
SF2 strains.

The Fabs were screened for CD4 inhibition of
their binding to gp120 (IIIB) immobilized on ELISA
wells. All, again with the exception of the V3
loop binders, showed sensitivity to CD4 inhibition.
The inhibition constants were in the range 10^-7 to
10^-9 M.

(c) Binding Affinity Determination
Using Surface Plasmon Resonance
Binding affinities were determined
for six of the Fabs using surface plasmon resonance.
Surface plasmon resonance was performed as it is a
more accurate method for measuring affinity than
competition ELISA. The six Fabs were chosen based
upon sequence analysis which revealed that the heavy
chains could be organized into 7 groups (Example 9).
Each group contained members with identical V-D and
D-J joining regions, implying a common clonal origin
with varying numbers of differences elsewhere in the
VH domain. Six Fabs were chosen as a representative
of each respective group for further study as
described herein. The single member of the seventh
group was not included in these studies. The binding
affinities of the six Fabs that are directed against
the CD4 binding site of the gp120 envelope
glycoprotein were determined using surface plasmon
resonance as follows.

A Pharmacia BIAcore machine was used for the
binding affinity determinations as previously
described in Malmborg et al., J. Immunol., 35:643-
650 (1992) and Mattsson et al., J. Immunol. Meth.,
145:229-240 (1991). Optimization for the Fab
fragments involved a number of steps. Two separate
channels on a biosensor chip were coated with gp120
derived from the HIV-1 strain LAI (Repligen,
Cambridge MA) such that one channel could be used
for the determination of on-rate constants (k_on) and
the other for the determination of off-rate
constants (k_off).

For immobilization of antigen on the sensor
surfaces, a flow rate of 5 µl/min of PBS, pH 7.4
was established over the biosensor chip. The chip
was then activated by injecting 30 µl of activation
solution (Pharmacia Biosensor, 50% 0.2 M
N-ethyl-N'-(3-diethylaminopropyl)-carbodiimide, 50%
N-hydroxysuccinimide). The flow rate was then
adjusted to 10 µl/min and the gp120 was injected in
10 mM sodium acetate buffer, pH 4.5. When
association rates were to be determined, 25 µl of
gp120 at 10 µg/ml was injected (a final level of
4000 Response Units (RU)). Twenty µl of gp120 at 2
µg/ml were injected for the determination of
dissociation constants (a final level of 800 RU).
In both cases, a flow rate of 5 µl/min was
reestablished following the gp120 injection and the
chip was blocked from any further immobilization by
the injection of 30 µl of 1 M ethanolamine, pH 8.5
(Pharmacia Biosensor).

For determination of on-rate constants (k_on), a
series of dilutions was made for each Fab to give
final concentrations in the range of 1 to 20 µg/ml.
30 µl of each Fab solution was injected in separate
experiments over the immobilized gp120 at a flow
rate of 5 µl/min. The change in response per unit
time (dR/dt) was plotted against time (t) for each
concentration. The slopes of each of these graphs
were then plotted against their corresponding
concentrations to give a final graph from which the
on-rate constant could be read.

For determination of off-rate constants (k_off),
30 µl of each Fab solution at 150 µg/ml were
injected over the immobilized antigen at a flow
rate of 5 µl/min. Once the reaction had reached
equilibrium, the Fab was removed from the antigen
at a constant flow rate of 50 µl/min. A plot was
then made of ln(R_1/R_0) against t_1 - t_0 for the
dissociation phase. R_1 is the response at time t_1
and R_0 is the initial response at time t_0. The
slope of this graph was taken to be the off-rate
constant. Affinities (K_a) were then calculated and
expressed as k_on/k_off.
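
The dissociation-phase analysis just described may be sketched in
Python as follows. The sketch fits a least-squares line to
ln(R_1/R_0) against t_1 - t_0 and reports the magnitude of the slope
as k_off; taking the magnitude is an interpretive assumption, since
the slope itself is negative for a response that decays over time.
The response trace is synthetic, generated from a simple exponential
decay, and the function names are hypothetical.

```python
import math


def least_squares_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


def off_rate(times_s, responses_ru):
    """k_off (s^-1) from a dissociation trace.

    Uses the first point as (t_0, R_0) and fits ln(R/R_0) against t - t_0.
    """
    t0, r0 = times_s[0], responses_ru[0]
    xs = [t - t0 for t in times_s]
    ys = [math.log(r / r0) for r in responses_ru]
    return -least_squares_slope(xs, ys)


def affinity(k_on, k_off):
    """K_a (M^-1) expressed as k_on / k_off."""
    return k_on / k_off


# Synthetic trace: exponential decay from 800 RU, sampled every 60 s.
SIM_K_OFF = 1.8e-4  # s^-1, used only to generate the example data
TIMES = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
RESPONSES = [800.0 * math.exp(-SIM_K_OFF * t) for t in TIMES]

if __name__ == "__main__":
    k_off = off_rate(TIMES, RESPONSES)
    print(f"k_off = {k_off:.2e} s^-1")
    print(f"K_a   = {affinity(9.6e3, k_off):.2e} M^-1 (with k_on = 9.6e3 M^-1 s^-1)")
```

With real sensorgram data the early part of the dissociation phase is
usually the most reliable region to fit, since rebinding and baseline
drift distort later points.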

The apparent affinities of the panel of
recombinant Fabs isolated from the donor as
determined in competition ELISA and surface plasmon
resonance were compared. Values of approximately
10^8 M^-1 were obtained by competition ELISA as
described in Example 2b6c in which the soluble and
immobilized gp120 competed for binding to Fab in
bacterial supernatants. Such a methodology only
gives an approximate measure of affinity.
Therefore, the affinities of six of these Fabs were
measured using real-time biospecific interaction
analysis (surface plasmon resonance) in order to
obtain more accurate affinity constant values. The
results are reproducible with a standard deviation
from the mean of approximately 5% as determined by
calculating a number of the affinity constants in
triplicate. All Fabs examined have affinities in
the range of 5 x 10^7 to 1 x 10^8 M^-1 as determined in
surface plasmon resonance (Table 4). These values
are in broad agreement with those derived from
competition ELISA. These values imply no
correlation between affinity for recombinant gp120
derived from LAI and the ability to neutralize the
HXBc2 clone of HIV-1 derived from LAI as assessed
in Example 3c.



Table 4

Fab     k_on (M^-1 s^-1)    k_off (s^-1)    K_a (M^-1)

b3      9.6 x 10^3          1.8 x 10^-4     5.1 x 10^7
b6      1.6 x 10^4          1.6 x 10^-4     9.7 x 10^7
b11     5.6 x 10^4          4.3 x 10^-4     1.3 x 10^8
b12     4.5 x 10^4          4.3 x 10^-4     1.1 x 10^8
b13     1.1 x 10^4          1.4 x 10^-4     7.9 x 10^7
b14     6.0 x 10^4          6.5 x 10^-4     9.2 x 10^7
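
As a consistency check on Table 4, the ratio k_on/k_off can be
recomputed for each Fab and compared with the tabulated K_a, as in the
following Python sketch. The variable and function names are
hypothetical; the numerical values are those of the table. The
recomputed ratios agree with the tabulated values to within about 5%,
which is in line with the rounding of the tabulated rate constants and
the stated reproducibility.

```python
# Table 4 values: Fab -> (k_on in M^-1 s^-1, k_off in s^-1, reported K_a in M^-1)
TABLE_4 = {
    "b3": (9.6e3, 1.8e-4, 5.1e7),
    "b6": (1.6e4, 1.6e-4, 9.7e7),
    "b11": (5.6e4, 4.3e-4, 1.3e8),
    "b12": (4.5e4, 4.3e-4, 1.1e8),
    "b13": (1.1e4, 1.4e-4, 7.9e7),
    "b14": (6.0e4, 6.5e-4, 9.2e7),
}


def recomputed_affinities(table=TABLE_4):
    """Return Fab -> (k_on / k_off, relative difference from reported K_a)."""
    result = {}
    for fab, (k_on, k_off, reported) in table.items():
        k_a = k_on / k_off
        result[fab] = (k_a, abs(k_a - reported) / reported)
    return result


if __name__ == "__main__":
    for fab, (k_a, rel_diff) in recomputed_affinities().items():
        print(f"{fab}: K_a = {k_a:.2e} M^-1, difference = {rel_diff:.1%}")
```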



Also contemplated are competition ELISA and
surface plasmon resonance assays where the binding
of HIV-1 recombinant Fabs of this invention is
performed in the presence of excess Fabs of this
invention as well as those HIV-1 antibodies,
polyclonal or monoclonal, present in patient sera,
either asymptomatic or symptomatic, or obtained by
other means such as EBV transformation and the
like. The ability of an exogenously admixed
antibody to compete for the binding of a
characterized Fab of this invention will allow for
the determination of equivalent antibodies in
addition to unique epitopes and binding
specificities.

3. Neutralizing Activity of Recombinant Human Fab
Fragments Against HIV-1 In Vitro

Binding of antibodies to viruses can result in
loss of infectivity or neutralization and, although
not the only defense mechanism against viruses, it
is widely accepted that antibodies have an
important role to play. However, understanding of
the molecular principles underlying antibody
neutralization is limited and lags behind that of
the other effector functions of antibody. Such
understanding is required for the rational design
of vaccines and for the most effective use of
passive antibody for prophylaxis or therapy. This
is particularly urgent for the human
immunodeficiency viruses.

A number of studies have led to the general
conclusion that viruses are neutralized by more
than one mechanism and the one employed will depend
on factors such as the nature of the virus, the
epitope recognized, the isotype of the antibody,
the cell receptor used for viral entry and the
virus:antibody ratio. The principal mechanisms of
neutralization can be considered as aggregation of
virions, inhibition of attachment of virus to cell
receptor and inhibition of events following
attachment such as fusion of viral and cellular
membranes and secondary uncoating of the virion.
One of the important features of the third
mechanism is that it may require far less than the
approximately stoichiometric amounts of antibody
expected for the first two mechanisms since
occupation of a small number of critical sites on
the virion may be sufficient for neutralization.
For instance it has been shown that neutralization
of the influenza A virion obeys single hit kinetics
as described by Outlaw et al., Epidemiol. Infect.,
106:205-220 (1992).

Intensive studies have been carried out on
antibody neutralization of HIV-1. For review, see
Nara et al., FASEB J., 5:2437-2455 (1991). Most
have focussed on a single linear epitope in the
third hypervariable domain of the viral envelope
glycoprotein gp120 known as the V3 loop.
Antibodies to this loop are suggested to neutralize
by inhibiting fusion of viral and cell membranes.
Binding to the loop resulting in neutralization can
occur prior to virus-cell interaction or following
gp120 binding to CD4. See, Nara, In Retroviruses
of Human Aids and Related Animal Diseases, eds.
Girard et al., pp. 138-150 (1988); Linsely et al.,
J. Virol., 62:3695-3702 (1988); and Skinner et al.,
J. Virol., 67:4195-4200 (1988). Features of the V3
loop are sequence variability within the loop
[Goudsmit et al., FASEB J., 5:2427-2436 (1991) and
Albert et al., AIDS, 4:107-112 (1990)] and
sensitivity of neutralizing antibodies against the
loop to sequence variations outside the loop [Nara
et al., FASEB J., 5:2437-2455 (1991); Albert et
al., supra; McKeating et al., AIDS, 3:777-784
(1989); and Wahlberg et al., AIDS Res. Hum.
Retroviruses, 7:983-990 (1991)]. Hence anti-V3 loop
antibodies are often strain specific and mutations
in the loop in vivo may provide a mechanism for
viral escape from antibody neutralization.

Recently considerable interest has focused on
antibodies capable of blocking CD4 binding to
gp120. A number of groups have described the
features of these antibodies as (a) reacting with
conformational, i.e., non-linear, epitopes, (b)
reacting with a wide range of virus isolates and
(c) being the predominant neutralizing antibodies
in humans after longer periods of infection. See,
Berkower et al., J. Virol., 65:5983-5990 (1991);
Steimer et al., Science, 254:105-108 (1991); Ho et
al., J. Virol., 65:489-493 (1991); Kang et al.,
Proc. Natl. Acad. Sci., USA, 88:6171-6175 (1991);
Posner et al., J. Immunol., 146:4325-4332 (1991);
and Tilley et al., Res. Virol., 142:247-259 (1991).
Neutralizing antibodies of this type would appear
to present a promising target for potential
therapeutics. The mechanism(s) of neutralization
of these antibodies is unknown although there is
some indication that this may not be blocking of
virus attachment since a number of mouse monoclonal
antibodies inhibiting CD4 binding to gp120 are
either non-neutralizing or only weakly
neutralizing.

The generation of human monoclonal antibodies
against the envelope of HIV-1 as described by
Burton et al., Proc. Natl. Acad. Sci., USA,
88:10134-10137 (1991) using combinatorial libraries
allows a novel approach to the problem of
neutralization. Given the lack of a
three-dimensional structure for gp120 and the
complexity of the virus, the approach seeks to
explore neutralization at the molecular level
through the behavior of related antibodies. This
is possible for the following reasons: (1) the
combinatorial approach allows the rapid generation
of large numbers of human antibodies; (2) the
antibodies (Fab fragments) are expressed in E. coli
and can readily be sequenced; and (3) antibodies
have similar sequences and common structural motifs
allowing functional differences to be meaningfully
correlated with primary structure.

Neutralization studies were performed as
described herein on the human recombinant Fab
fragments from 20 clones against gp120 prepared as
described in Examples 1 and 2, all of which are
strain cross-reactive and inhibited by CD4 from
binding to gp120. The results presented herein
show that neutralization was not effected by virus
aggregation or cross-linking of gp120 molecules on
the virion surface and was not correlated with
blocking of the interaction between soluble CD4 and
recombinant gp120.

Neutralization studies were also performed as 
described herein on the human recombinant Fab 
fragments from the gp41-reactive clones prepared as
described in Examples 1 and 2. The results are 
presented below. 

Two different assays, a p24 ELISA assay and a
syncytium assay, were performed to measure 
neutralization ability of the recombinant human 
HIV-1 immunoreactive Fabs. An additional assay, a 
plaque assay, was performed for determining the 
neutralization effectiveness of the gp41-reactive 
Fabs. In plaque assays, CD4+ cells were cultured 
in the presence or absence of soluble gp41-reactive
Fabs prior to inoculation with virus. Inhibition 
of infectivity, also referred to as neutralization, 
by antibodies was expressed as the percent of 
plaque formation in the cultures compared to cells 
exposed to PBS alone. 

Neutralization assays were also performed with 
an antibody molecule consisting of the light chain 
and the VH region of the Fab 12 and the constant
regions (CH1, CH2, and CH3) of an IgG1 molecule.
Quantitative infectivity microplaque and syncytial
formation assays to measure neutralization were
performed with the b12 IgG1 and laboratory isolates
MN and IIIB of HIV-1 virus. In the syncytial
formation assay, virus was grown in H9 cells and 
infectivity measured by culturing monolayers of 
CEM-SS target cells with 100-200 syncytial forming 
units (SFUs) of virus, in the presence or absence 
of antibody. p24 ELISA and microplaque formation 
assays were also performed with primary clinical 
isolates of the HIV-1 virus. 

In addition, the ability of the recombinant
human HIV-1 immunoreactive Fabs b3, b6, b12, b13,
and b14 to neutralize the HXBc2 molecular clone of
gp120 derived from HTLV-IIIB (LAI) was determined
in an envelope complementation assay. The
supernatant containing recombinant HIV-1 virions
from cotransfected COS-1 cells was incubated with
the recombinant Fabs prior to incubation with
Jurkat cells. The recombinant HIV-1 virions
contained the HXBc2 clone of HIV-1 strain LAI which
encodes a chloramphenicol acetyltransferase (CAT)
gene. Upon infection of Jurkat cells with the
recombinant HIV-1 virions, the CAT gene was
expressed and CAT activity measured. Activity of
the CAT gene was therefore an indication of
infectivity of the Jurkat cells with the
recombinant HIV-1 virion. Lack of CAT activity
indicated the Jurkat cells were not infected with
the recombinant HIV-1 virion.

For some of these assays, the recombinant Fabs 
were first purified. One liter cultures of SB
containing 50 µg/ml carbenicillin and 20 mM MgCl2
were inoculated with appropriate clones and induced
7 hours later with 2 mM IPTG and grown overnight at
30°C. The cell pellets were sonicated and the
resultant supernatants were concentrated to a 50 ml
volume. The filtered supernatants were loaded on a
25 ml protein G-anti-Fab column, washed with 120 ml
buffer at a rate of 3 ml/minute and eluted with
citric acid at pH 2.3. The neutralized fractions
were then concentrated and exchanged into 50 mM MES
at pH 6.0 and loaded onto a 2 ml Mono-S column at a
rate of 1 ml/minute. A gradient of 0-500 mM NaCl
was run at 1 ml/minute with the Fab eluting in the
range of 200-250 mM NaCl. After concentrating, the
Fabs were positive when titered on ELISA against
gp120 and gave a single band at 50 kD by 10-15%
SDS-PAGE. Concentration was determined by
absorbance measurement at 280 nm using an extinction
coefficient (1 mg/ml) of 1.4.
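
As a worked illustration of the absorbance-based concentration estimate, the following minimal Python sketch converts an A280 reading to mg/ml using the stated extinction coefficient of 1.4 for a 1 mg/ml solution. The function name, the 1 cm path length, the dilution-factor argument, and the example reading are assumptions made for illustration and are not taken from the procedure above.

```python
def fab_concentration_mg_per_ml(a280, dilution_factor=1.0,
                                ext_coeff_1mg_ml=1.4, path_cm=1.0):
    """Estimate protein concentration (mg/ml) from absorbance at 280 nm.

    ext_coeff_1mg_ml is the absorbance of a 1 mg/ml solution (1.4 in the text).
    path_cm and dilution_factor are illustrative assumptions.
    """
    if a280 < 0 or dilution_factor <= 0 or path_cm <= 0:
        raise ValueError("absorbance must be >= 0; dilution and path must be > 0")
    # Beer-Lambert rearranged: concentration = A / (coefficient * path length),
    # scaled back up by any dilution applied before the reading.
    return a280 * dilution_factor / (ext_coeff_1mg_ml * path_cm)


if __name__ == "__main__":
    # Hypothetical reading: A280 of 0.70 measured on an undiluted sample.
    print(f"{fab_concentration_mg_per_ml(0.70):.2f} mg/ml")
```

A reading taken on a diluted sample would be passed with the corresponding dilution_factor so that the result refers to the undiluted preparation.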

a. Neutralization as Measured by the p24
ELISA Assay

For this assay, diluted tissue culture
supernatants of HIV-1 IIIB or MN-infected
peripheral blood mononuclear cells (PBMC) (50 TCID50
(50% tissue culture infectious dose), 100 µl) were
maintained for 2 hours at 37°C with serial
dilutions (1:2), beginning at a dilution of 1:20,
of recombinant Fab supernates prepared in Example
2b6. Control Fab supernates were also provided
that included human neutralizing sera, a known
human neutralizing monoclonal antibody 2F5 and the
Fab fragment derived from that antibody by papain
digestion, and a known mouse neutralizing
monoclonal antibody and its F(ab')2 fragment as
described by Broliden et al., J. Virol., 64:936-940
(1990). PBMC (1 x 10^5 cells) were admixed with the
virus/antibody admixture and maintained for 1 hour
at 37°C. Thereafter, the cells were washed and
maintained in RPMI 1640 medium (GIBCO) supplemented
with 10% fetal calf serum, 1% glutamine,
antibiotics and IL-2. The culture medium was
changed at days 1 and 4. At 7 days post-infection,
supernates were collected and analyzed by HIV-1 p24
antigen capture ELISA as described by Sundqvist et
al., J. Med. Virol., 29:170-175 (1989), the
disclosure of which is hereby incorporated by
reference. Neutralization was defined as positive
if an 80% or greater reduction of optical density
at 490 nm in the culture supernatant occurred as
compared to negative Fab or negative human serum.
Tests with all Fabs, mAbs and sera were repeated on
at least two occasions.
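
The 80% criterion can be sketched in a few lines of Python. The sketch below computes the percent reduction in optical density relative to a negative control and reports the greatest reciprocal dilution that still meets the criterion. The function names and the optical density readings are hypothetical and serve only to illustrate the arithmetic; they are not data from the assays described herein.

```python
def percent_reduction(od_test, od_control):
    """Percent reduction in OD (490 nm) relative to the negative control."""
    if od_control <= 0:
        raise ValueError("control optical density must be positive")
    return 100.0 * (1.0 - od_test / od_control)


def neutralization_titer(od_by_dilution, od_control, threshold=80.0):
    """Return the greatest reciprocal dilution with >= threshold % reduction.

    od_by_dilution maps reciprocal dilution (20 means 1:20) to measured OD.
    Returns None when no dilution meets the criterion.
    """
    positive = [dilution for dilution, od in od_by_dilution.items()
                if percent_reduction(od, od_control) >= threshold]
    return max(positive) if positive else None


# Hypothetical two-fold dilution series starting at 1:20.
readings = {20: 0.05, 40: 0.10, 80: 0.18, 160: 0.55, 320: 0.90}
titer = neutralization_titer(readings, od_control=1.0)
print("no neutralization" if titer is None else f"titer 1:{titer}")
```

Returning None for a supernate that never reaches the criterion mirrors a negative result at the starting dilution.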

b. Quantitative Infectivity Assay Based on
Syncytial Formation

A quantitative neutralization assay with 
the MN strain of HIV-1 was performed as described 
by Nara et al., AIDS Res. Human Retroviruses,
3:283-302 (1987), the disclosure of which is hereby 
incorporated by reference. Monolayers of CEM-SS 
target cells were cultured with virus, in the 
presence or absence of antibody, and the number of 
syncytia forming units determined 3-5 days later. 
An equivalent amount of virus was used in the 
assays to allow direct comparison of the various 
antibody concentrations tested. The assays were 
repeatable over a virus-surviving fraction range of
1 to 0.001 within a 2 to 4-fold difference in the 
concentration of antibody (P<0.001). 
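
For orientation, the virus-surviving fraction referred to above can be sketched as the ratio of syncytia forming units counted with antibody to those counted without antibody. The Python sketch below is a minimal illustration under that assumption; the function names and the counts are hypothetical and are not taken from the cited method.

```python
def surviving_fraction(sfu_with_antibody, sfu_without_antibody):
    """Fraction of infectious units remaining in the presence of antibody."""
    if sfu_without_antibody <= 0:
        raise ValueError("the antibody-free control must contain virus")
    if sfu_with_antibody < 0:
        raise ValueError("counts cannot be negative")
    return sfu_with_antibody / sfu_without_antibody


def percent_neutralization(sfu_with_antibody, sfu_without_antibody):
    """Percent of input infectivity removed by the antibody."""
    return 100.0 * (1.0 - surviving_fraction(sfu_with_antibody,
                                             sfu_without_antibody))


# Hypothetical counts: 150 SFU in the control well, 15 SFU with antibody.
fraction = surviving_fraction(15, 150)
print(f"surviving fraction {fraction:.3f}, "
      f"neutralization {percent_neutralization(15, 150):.0f}%")
```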

c. Neutralization as Measured by the
Envelope Complementation Assay

The ability of purified recombinant Fabs
b3, b6, b11, b12, b13, and b14 to neutralize the
HXBc2 gp120 molecular clone of the HIV-1 (LAI)
isolate was assessed in an envelope complementation
assay (Helseth et al., J. Virol., 65:2119-2123
(1991)). Briefly, COS-1 cells were cotransfected
with a plasmid expressing envelope glycoprotein 120
derived from HIV-1 (LAI) and a plasmid containing
an env-defective HXBc2 clone and encoding the
bacterial CAT gene. Equal fractions of the cell
supernatants containing recombinant virions were
incubated at 37°C for 1 hour with varying
concentrations of recombinant Fab (0.1 - 20 µg/ml)
or control monoclonal antibody 110.4 prior to
incubation with Jurkat cells. Three days post-infection,
the Jurkat cells were lysed and CAT
activity measured. Neutralization was expressed as
a decrease in the percentage of residual
chloramphenicol acetyltransferase (CAT) activity.
Control monoclonal antibody 110.4 is a strongly
neutralizing antibody directed to the V3 loop of
the HXBc2 HIV-1 strain.



d. Results of the Neutralization Assays for
gp120

Assays were generally repeated at least 
twice with reproducible results. For the data 
reported in Figure 6, the gp120-specific Fab
supernates were divided into two parts, one being 
used in the p24 assay and the other in the syncytia 
assay. A dash (-) indicates that there was no 
neutralization at 1:20 dilution in the p24 assay 
and 1:16 in the syncytial assay (with most clones 
showing no detectable neutralization at a 1:4 
dilution). Neutralization titers are indicated in
the figure. For the p24 assay, the titer 
corresponds to the greatest dilution producing >80% 
reduction in absorbance in ELISA. For the syncytia 
assay, Fabs 4 and 12 produced >95% neutralization 
at a 1:4 dilution of supernate and 80 and 70% 
reduction at 1:128 dilution respectively. These 
Fabs were effective neutralizers in both types of 
assays. They have also been shown to neutralize 
infection by IIIB and RF strains using a PCR-based 
assay of proviral integration. Fabs 6 and 7 showed 
no neutralization in the syncytia assay but other 
supernate preparations showed activity. Fab 13 was 
consistently effective in the p24 assay but not in 
the syncytia assay. A number of other clones show 
lower levels of neutralizing ability. 

Fabs were purified from a selection of some of 
the clones as described above and used in both 
neutralization assays. As shown in Figure 9, Fabs 
4 and 12 were again effective in both assays at 
similar levels with for example 50% inhibition of 
syncytial formation at an Fab concentration of 
approximately 20 nM (1 µg/ml). The results shown
are derived from the syncytia assay using the MN
strain. Fabs 7 and 21 were equally effective in
the syncytial assay but somewhat less so in the p24
assay. The p24 assay indicated greater than 80%
neutralization of HIV-1 MN strain for Fab 4 at 3,
Fab 7 at 15, Fab 12 at 3, Fab 13 at 4 and Fab 21 at
7 µg/ml, respectively. Fab 13, however, was
ineffective in the syncytial assay at 25 µg/ml.
For the IIIB strain, greater than 80%
neutralization was observed for Fab 4 at 13, Fab 7
at 15, Fab 12 at 7 and Fab 21 at 14 µg/ml,
respectively. Although Fab 11 was not effective in
neutralization assays when unpurified as shown in
Figure 6, following purification, Fab 11 was
as effective as Fab 12 in neutralizing HIV-1.
For this reason, the Fab is being deposited with
the ATCC as described in Example 12 along with Fab
12 and Fab 13.
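
The equivalence of approximately 20 nM and 1 µg/ml quoted above follows from the 50 kD size of the purified Fab. The short Python sketch below shows the unit conversion; the function names are hypothetical, and the molecular mass is taken as exactly 50,000 daltons as an approximation of the SDS-PAGE estimate.

```python
FAB_MOLECULAR_MASS_DA = 50_000  # approximation of the 50 kD SDS-PAGE band


def nanomolar_to_ug_per_ml(conc_nm, mass_da=FAB_MOLECULAR_MASS_DA):
    """Convert a molar concentration (nM) to a mass concentration (µg/ml)."""
    grams_per_liter = conc_nm * 1e-9 * mass_da  # mol/L multiplied by g/mol
    return grams_per_liter * 1e3                # 1 g/L equals 1000 µg/ml


def ug_per_ml_to_nanomolar(conc_ug_ml, mass_da=FAB_MOLECULAR_MASS_DA):
    """Convert a mass concentration (µg/ml) to a molar concentration (nM)."""
    grams_per_liter = conc_ug_ml / 1e3
    return grams_per_liter / mass_da * 1e9


print(f"20 nM Fab is about {nanomolar_to_ug_per_ml(20):.2f} µg/ml")
print(f"25 µg/ml Fab is about {ug_per_ml_to_nanomolar(25):.0f} nM")
```

The same conversion applied to a whole IgG would require its larger molecular mass, so values in µg/ml for Fab and IgG are not directly comparable on a molar basis.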

The ability of purified recombinant Fabs b3,
b6, b11, b12, b13, and b14 to neutralize the HXBc2
gp120 molecular clone of the HIV-1 (LAI) isolate
was assessed in an envelope complementation assay.
Figure 23 shows the concentration dependence of Fab
neutralization of the HXBc2 clone in this assay.
All of the Fabs neutralize effectively at the
highest concentration measured (20 µg/ml).
Irrelevant Fabs, Fabs directed to surface
glycoproteins on other viruses such as RSV, do not
neutralize in this assay. Examination of the lower
concentrations clearly reveals that Fab b12 is the
most effective neutralizer. The neutralizing
potency of Fab b12 was greater than that of the
110.4 whole monoclonal antibody tested in parallel.
The 110.4 antibody is one of the most potent
antibodies directed against the V3 loop of the
HXBc2 HIV-1 strain (Thali, M. and J. Sodroski,
unpublished observations). In other studies, Fab
b12 has been found to show exceptional neutralizing
ability towards laboratory (Example 3 and Barbas et
al., Proc. Natl. Acad. Sci., USA, 91, in press
(1994)) and field isolates of HIV-1 as described in
Example 5.



There are a number of conclusions arising from
the data shown in Figures 6, 9 and 23. It is
apparent that HIV-1 can be neutralized without
virion aggregation or cross-linking of gp120
molecules on the virion surface since monovalent
Fab fragments are effective. To further confirm
this finding, a Fab fragment was produced by papain
digestion of a known neutralizing human monoclonal
antibody. As shown in Figure 6, the Fab fragment
was approximately as effective as the whole
IgG in neutralization of the MN strain of HIV-1.
This is consistent with results on Fabs prepared
from two mouse monoclonal antibodies to the V3
loop. An F(ab')2 fragment of a mouse monoclonal
antibody was somewhat less effective than the
parent IgG in neutralization of the MN strain.
Interestingly, the fragments from these control
antibodies were relatively poor in neutralizing the
IIIB strain of HIV-1. The results also show that
there appears to be a difference between the two
assays employed since Fab 13 was consistently
effective in one assay but not the other. The
principal variables were the incubation time of the
virus and antibody prior to infection (2 hours for
the p24 assay and 0.5 hours for the syncytial
assay), the amount of virus used for infection, the
cells used to propagate virus (human PBMCs for the
former and H9 cells for the latter) and the cells
infected (human PBMCs for the former and CEM-SS
cells for the latter). Of these, there is a strong
possibility that the MN virus used in the two
assays, having been passaged through different
cells, is critically different.

The Fabs show a spectrum of neutralizing
ability for gp120 from a molecular clone HXBc2
derived from the HIV-1 strain LAI in the envelope
complementation assay. Fab b12 exhibited the
greatest potency of neutralization and was even
more effective in this assay than a whole antibody
directed to the V3 loop of gp120. Neutralizing
ability is not correlated with either the apparent
affinity of the Fab for gp120 derived from the
recombinant HIV-1 strain LAI as estimated by
competition ELISA or the affinity for gp120 derived
from HIV-1 strain LAI as determined by surface
plasmon resonance. For example, Fabs b6, b12, and
b14 have very similar affinities by surface plasmon
resonance (Table 4) but different neutralization
ability in the envelope complementation assay
(Figure 23). Similarly, neutralization is not
correlated with the ability of the Fab to compete
with soluble CD4 in a competition ELISA.


e. Results of the Neutralization Assays for 
gp41 

The gp41-reactive Fabs exhibited
specificity to the conformational epitope of gp41
including amino acid residues in positions 565-585
and 644-663. The five selected gp41-specific Fabs
were designated DL 41 19, DO 41 11, GL 41 1, MT 41
12 and SS 41 8. Neutralization assays were
performed as described above for the gp120-reactive
Fabs. In the plaque assays, the data shown are the
concentration of Fab in micrograms/milliliter
required to achieve 50% neutralization. The
data for the other two neutralization assays are
also expressed in micrograms/milliliter of Fab
required to neutralize infection as defined in the
description of the p24 and syncytial assays above.
The results of the three neutralization assays,
plaque, syncytial and p24, are presented in Table
5. The MN and IIIB HIV strains were used as
indicated in Table 5 for the assays. The
abbreviation "ND" stands for not determined when
indicated in the table.

Table 5

Assay/Strain     Plaque            Syncytial    p24
Fab              MN      IIIB      IIIB         MN      IIIB

DL 41 19         <4      <40       1.4          ND      ND
DO 41 11         <40     7.1       2.3          0.9     ND
GL 41 1          <4      <4        1.7          ND      3.5
MT 41 12         <40     <40       5.5          4.5     4.5
SS 41 8          <4      <4        2.2          ND      7.1



As shown in Table 5, all five Fabs were 
effective at neutralizing both MN and IIIB strains 
of HIV in either plaque, syncytial or p24 assays. 
Fabs DL 41 19 and DO 41 11 exhibited strain 
specificity in the plaque assay where the former 
was ten-fold more effective at inhibiting plaque
formation with the MN strain than with the IIIB 
strain. The opposite specificity was seen with the 
DO 41 11 Fab. However, both Fabs exhibited 
comparable neutralization as measured by the 
syncytial assay. Two Fabs, GL 41 1 and SS 41 8, 
were equally effective at inhibiting plaque 
formation with either MN or IIIB strains. The Fab 
MT 41 12 was similarly not strain-specific although 
neutralization required 10-fold more antibody. No
strain specificity was evident when Fab MT 41 12 
was used in p24 assays where the same amount of 
antibody was equally effective. All five
antibodies neutralized IIIB as measured in the
syncytial assay.

Thus, the five gp41-specific Fabs neutralized
HIV-1 MN and IIIB in at least two of the three
assays used for measuring neutralizing activity.
Moreover, strain specificity was evident in two
of the five Fabs as measured by the plaque assay.
Based on these differential neutralization
characteristics, the gp41-specific Fabs provide
useful therapeutic reagents for neutralizing HIV-1.

4. Construction of a Mammalian Expression Vector
pEE12 Combo BM 12 for the Expression of an
IgG1 Antibody Molecule with the Fab from b12
(b12 IgG1)

Although Fab b12 is capable of neutralizing
some primary isolates, the corresponding whole 
antibody molecule is likely to be more effective. 
The whole antibody, consisting of the Fab fragment 
and the Fc domain, participates in the elimination 
of foreign cells by first binding specifically to 
the foreign cell via the Fab portion and 
interacting with other cells in the immune system 
via the Fc domain. The Fc domain also enables the 
antibody to bind complement. 

Fab b12 was converted to a whole IgG1 molecule
(b12 IgG1) by cassetting the variable heavy chain
(VH) and light chain genes into a vector created
for high-level mammalian expression. b12 IgG1 used
in the neutralization studies was prepared by 
expression in Chinese hamster ovary (CHO) cells and 
purified by affinity chromatography. 

The strategy to convert the Fab b12 to a whole
IgG1 molecule was similar to that described
previously for the generation of a whole antibody
beginning with a phage derived Fab (Bender et al.,
Hum. Antibod. Hybridomas, 4:74-79 (1992)).

a. Construction of b12 Heavy Chain IgG1 pSG-5
Mammalian Expression Vector
1) Modification of b12 Heavy Chain
Variable Region to Introduce a Kozak
Sequence, Mammalian Leader Sequence,
and Human VH Consensus Sequence
First, the b12 VH region was cloned
into a pSG-5 expression vector (Green et al., Nucl.
Acids Res., 16:369 (1988)) to fuse the b12 VH to
the heavy chain constant domains (CH1, CH2, and
CH3) of an IgG1 antibody molecule. The
double-stranded Fab b12 DNA was used as a template
for isolating the gene encoding the VH region of
the Fab b12, the amino acid residue sequence of
which is listed in SEQ ID NO 66. Fab b12 DNA and
mouse B72.3 IgG1 DNA (Whittle et al., Protein
Eng., 1:499 (1987) and Bruggemann et al., J. Exp.
Med., 166:1351 (1987)) were used as templates for a
PCR amplification for the construction of a DNA
fragment consisting of the unique Kozak sequence
for the control of heavy chain expression, the
mouse B72.3 heavy chain leader sequence
(MEWSWVFLFFLSVTTGVHS (SEQ ID NO 155 from amino acid
residue sequence 1 to 20)), the human VH consensus
sequence (QVQLVQ (SEQ ID NO 155 from amino acid
residue sequence 21 to 26)), and the VH region of
the Fab b12. Altering the beginning of the VH from
the mouse consensus sequence to the human consensus
sequence also destroyed the original Xho I cloning
site. The restriction sites EcoR I and Sst I were
introduced in the amplification reaction and were
located at the 5' and 3' ends of the fragment,
respectively. The procedure for creating the
modified VH fragment by combining the products of
the two separate PCR amplifications is described
below.

The primer pair, HC-1 (SEQ ID NO 157) and HC-2
(SEQ ID NO 158) as shown in Table 10, was used in
the first PCR reaction to amplify a portion of the
Fab b12 VH gene and incorporate the human heavy
chain consensus sequence into the 5' end of the VH
fragment and introduce an Sst I cloning site in the
3' end of the VH fragment. In addition, the 5' PCR
primer introduces sequences into the VH fragment
which form 27 base pairs of homology with the mouse
leader sequence fragment prepared below. The 27
base pairs of homology in the fragments are used in
a subsequent PCR reaction to fuse the two PCR
products (Yon and Fried, Nucl. Acids Res., 17:4895
(1989)) to form a modified VH fragment consisting
of the EcoR I cloning site, the mouse B72.3 leader
sequence, the human consensus sequence, the
remaining VH coding sequence, and the Sst I cloning
site. For the PCR reactions, 1 µl containing 100
ng of Fab b12 DNA was admixed with 10 µl of 10X PCR
buffer in a 0.5 ml microfuge tube. To the DNA
admixture, 8 µl of a 2.5 mM solution of dNTPs
(dATP, dCTP, dGTP, dTTP) was admixed to result in a
final concentration of 200 micromolar (µM) of each
dNTP. 1 µl (equivalent to 20 picomoles (pmol)) of
the 5' forward HC-1 primer and 1 µl (20 pmol) of the
3' backward HC-2 primer were admixed into the DNA
solution. To the admixture, 73 µl of sterile water
and 2.5 units of Taq DNA polymerase were added. Two
drops of mineral oil were placed on top of the
admixture and 35 rounds of PCR amplification in a
thermocycler were performed. The amplification
cycle consisted of 52°C for 1 minute, 72°C for 2
minutes and 94°C for 0.5 minutes.
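
The stated final dNTP concentration can be checked with the usual dilution relation C1 x V1 = C2 x V2. The Python sketch below tallies the volumes listed above and computes the final dNTP and primer concentrations. The 100 µl final reaction volume is an assumption implied by the use of 10 µl of 10X buffer; the listed components sum to 94 µl and the volume of the Taq polymerase addition is not given. All names in the sketch are hypothetical.

```python
def final_concentration(stock_conc, stock_volume_ul, final_volume_ul):
    """Solve C1*V1 = C2*V2 for C2 (result is in the units of stock_conc)."""
    if final_volume_ul <= 0:
        raise ValueError("final volume must be positive")
    return stock_conc * stock_volume_ul / final_volume_ul


# Volumes (in microliters) listed in the text for the HC-1/HC-2 reaction.
LISTED_VOLUMES_UL = {
    "Fab b12 DNA (100 ng)": 1.0,
    "10X PCR buffer": 10.0,
    "dNTP solution (2.5 mM each)": 8.0,
    "HC-1 primer (20 pmol)": 1.0,
    "HC-2 primer (20 pmol)": 1.0,
    "sterile water": 73.0,
}

# Assumption: a 100 µl reaction, as implied by 10 µl of 10X buffer.
ASSUMED_FINAL_VOLUME_UL = 100.0

listed_total_ul = sum(LISTED_VOLUMES_UL.values())
# 2.5 mM equals 2500 µM, so the result is expressed in µM.
dntp_final_um = final_concentration(2500.0, 8.0, ASSUMED_FINAL_VOLUME_UL)
# 1 pmol per µl equals 1 µM, so pmol divided by µl gives µM directly.
primer_final_um = 20.0 / ASSUMED_FINAL_VOLUME_UL

print(f"listed volumes total {listed_total_ul:.0f} µl")
print(f"each dNTP at {dntp_final_um:.0f} µM, each primer at {primer_final_um:.1f} µM")
```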

The primer pair, HC-3 (SEQ ID NO 159) and HC-4
(SEQ ID NO 160) as shown in Table 10, was used in a
separate PCR reaction to amplify the mouse B72.3
leader sequence and incorporate an EcoR I cloning
site at the 5' end of the fragment and to introduce
a 27 base pair sequence which has homology to the
modified VH fragment prepared above. Double-stranded
DNA encoding the mouse B72.3 IgG1
(Whittle et al., supra) was used as a template for
preparation of the mouse B72.3 leader sequence. The
PCR reaction to prepare the mouse leader sequence
fragment was performed using the same conditions as
described above for the preparation of the modified
VH fragment.

The resultant PCR modified b12 VH DNA fragment
and mouse leader sequence fragment were purified by
electrophoresis in a 2.5% Nu-Sieve agarose gel
(FMC). The areas in the agarose containing the
modified b12 VH DNA fragment and mouse leader
sequence fragment were excised from the agarose.

A third PCR amplification using the primer
pair, HC-1 (SEQ ID NO 157) and HC-3 (SEQ ID NO
159) as shown in Table 10, was performed to fuse
the mouse leader fragment with the modified VH
fragment. The primers used for this amplification
were designed to preserve an EcoR I site, a unique
Kozak sequence, and the mouse B72.3 heavy chain
leader sequence on the 5' end of the amplified
fragment and to preserve the Sst I cloning site on
the 3' end of the amplified fragment. The
templates used in this PCR reaction were the two
purified PCR reaction products described above.
The PCR reaction and subsequent purification of the
PCR product were performed as described above.

2) Modification of b12 Heavy Chain
Variable Region to Eliminate a Bgl II
Restriction Site
The b12 modified heavy chain
fragment prepared in Example 4a1 contained a Bgl II
cloning site at amino acid residue 87 which would
interfere with the insertion of the heavy chain
fragment into the pEE6 mammalian expression vector.
The Bgl II restriction site was therefore
eliminated in a PCR reaction using primers which
destroyed the Bgl II restriction site while
preserving the encoded amino acid, arginine at
amino acid residue 87 of the modified b12 heavy
chain fragment.

The primer pair, HC-1 (SEQ ID NO 157) and HC-6
(SEQ ID NO 162) as shown in Table 10, was used in
the first PCR reaction to preserve the 5' region of
the modified b12 heavy chain fragment and destroy
the Bgl II restriction site at amino acid residue
87 of the heavy chain. The HC-6 primer introduces
sequences into the VH fragment which form 32 base
pairs of homology with the remaining portion of the
VH fragment which will be prepared as described
below. The 32 base pairs of homology in the
fragments were used in a subsequent PCR reaction to
fuse the two PCR products (Yon and Fried, supra) to
form a modified VH fragment as described above but
without the Bgl II restriction site. The PCR
reaction was performed and the PCR products were
purified as described in Example 4a1.

The primer pair, HC-2 (SEQ ID NO 142) and HC-5 
(SEQ ID NO 145) as shown in Table 10, was used in 
the second PCR reaction to preserve the 3' region
of the modified b12 heavy chain fragment and
destroy the Bgl II restriction site. The HC-5 
primer introduces sequences into the VH fragment 
which form 32 base pairs of homology with the 
remaining portion of the VH fragment which was 
prepared in the first PCR reaction. PCR products 
which have incorporated the HC-5 and HC-6 primers 
contain 32 base pairs of overlapping sequences 
which are identical. It is the annealing of the 
two PCR products at these 32 base pairs during the 
subsequent PCR reaction which fuses the two 
portions of the VH fragment together to recreate 
the entire VH fragment as described in Yon and 
Fried (supra).

A third PCR amplification using the primer
pair, HC-1 (SEQ ID NO 157) and HC-3 (SEQ ID NO
159) as shown in Table 10, was performed to fuse
the two VH fragments in which the Bgl II
restriction site had been destroyed. The primers
used for this amplification were designed to
preserve an EcoR I site, a unique Kozak sequence,
and the mouse B72.3 heavy chain leader sequence on
the 5' end of the amplified fragment and the Sst I
cloning site on the 3' end of the amplified
fragment. The templates used in this PCR reaction



WO 96/02273 PCT/US95/08743 


were the two purified PCR reaction products
described above. The PCR reaction and subsequent
purification of the PCR product were performed as
described in Example 4a1.


Table 10

SEQ ID NO   Note   Primer     Sequence

(141)       1      HC-1 (F)   5' CAGGTTCAGCTGGTTCAGTCCGGGGCT 3'
(142)       2      HC-2 (B)   5' CCTTGGAGCTCACGATGACCGTGGTTCCTTGGCCCCAGACGTCC 3'
(143)       3      HC-3 (F)   5' GGCCGCGAATTCGCCGCCACCATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTA 3'
(144)       2      HC-4 (B)   5' AGCCCCGGACTGAACCAGCTGAACCTG 3'
(145)       4      HC-5 (F)   5' GGAGTTGAGGAGCCTCAGGTCTGCAGACACGG 3'
(146)       4      HC-6 (B)   5' CCGTGTCTGCAGACCTGTGGCTCCTCAACTCC 3'
(147)              LC-1 (F)   5' GATGCCAGATGTGAGATCGTTCTCACGCAGTCT 3'
(148)       3, 5   LC-2 (B)   5' GCGGGATCCGAATTCTCTAGAATTAACACTCTCCCCTGTTGAAGCTCTTTGTGACGGGCGAACTCAG 3'
(149)       3      LC-3 (F)   5' GCGCGAATTCACCATGGGTGTGCCCACTCAGGTCCTGGGGTTGCTGCTGC 3'
(150)              LC-4 (B)   5' AGACTGCGTGAGAACGATCTCACATCTGGCATC 3'
(151)       6      LC-5 (F)   5' GCGCAAGCTTACCATGGGTGTGCCCACTCAGGTCCTGGGGTTGCTGCTGC 3'

F  Forward Primer
B  Backward Primer
1  the Sst I cloning site is single underlined
2  the primers, HC-2 and HC-4 contain
   complementary sequences
3  the EcoR I cloning site is single underlined
4  in HC-4, the G that is double underlined was
   altered from an A to eliminate a Bgl II
   restriction site; in HC-5, the C that is
   double underlined was altered from a T to
   eliminate a Bgl II restriction site
5  the base A that is double underlined was
   introduced in the PCR primer to alter the
   encoded amino acid from an arginine, R, to a
   serine, S
6  the Hind III cloning site is single underlined
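
The overlapping primers used for the PCR fusions can be examined with a short script. The Python sketch below uses a selection of the primer sequences as printed in Table 10, tests whether two primers are exact reverse complements of one another (the basis of the homology used to fuse two PCR products), and lists which restriction enzyme recognition sequences occur in a primer. The helper names are hypothetical; the recognition sequences (EcoR I GAATTC, Sst I GAGCTC, Hind III AAGCTT, Bgl II AGATCT) are the standard ones for those enzymes.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as printed in Table 10 (5' to 3').
PRIMERS = {
    "HC-1": "CAGGTTCAGCTGGTTCAGTCCGGGGCT",
    "HC-2": "CCTTGGAGCTCACGATGACCGTGGTTCCTTGGCCCCAGACGTCC",
    "HC-3": "GGCCGCGAATTCGCCGCCACCATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTA",
    "HC-4": "AGCCCCGGACTGAACCAGCTGAACCTG",
    "LC-1": "GATGCCAGATGTGAGATCGTTCTCACGCAGTCT",
    "LC-4": "AGACTGCGTGAGAACGATCTCACATCTGGCATC",
    "LC-5": "GCGCAAGCTTACCATGGGTGTGCCCACTCAGGTCCTGGGGTTGCTGCTGC",
}

# Standard recognition sequences; each is its own reverse complement.
RESTRICTION_SITES = {
    "EcoR I": "GAATTC",
    "Sst I": "GAGCTC",
    "Hind III": "AAGCTT",
    "Bgl II": "AGATCT",
}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def are_reverse_complements(name_a, name_b):
    """True when two primers anneal to each other along their full length."""
    return PRIMERS[name_a] == reverse_complement(PRIMERS[name_b])


def sites_in(name):
    """Names of the listed restriction sites present in a primer."""
    return [enzyme for enzyme, site in RESTRICTION_SITES.items()
            if site in PRIMERS[name]]


for pair in (("HC-1", "HC-4"), ("LC-1", "LC-4")):
    print(pair, "fully complementary:", are_reverse_complements(*pair))
for name in ("HC-2", "HC-3", "LC-5"):
    print(name, "contains:", sites_in(name))
```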


3) Insertion of Modified b12 Heavy
Chain Variable Region into the pSG-5
Mammalian Expression Vector
The modified b12 heavy chain
variable region PCR product was ligated into a
mammalian expression vector (Adair et al., Hum.
Antibod. Hybridomas, in press). The mammalian
expression vector consisted of the pSG-5 vector
(Figure 24) with a human IgG1 gene inserted at the
EcoR I site. The human IgG1 gene contained a VH
insert in the same reading frame as the constant
regions of the human IgG1 gene. The VH insert was
removed by digestion with EcoR I and Sst I enzymes.
The constant regions (CH1, CH2, and CH3) remained
in the pSG-5 vector. Transcription of the heavy
chain gene in the pSG-5 expression vector is under
the control of the SV40 early promoter.
Transcriptional termination is signaled by the SV40
polyadenylation signal sequence downstream of the
heavy chain sequence. The M13 intergenic region
allows for the production of single-stranded DNA
for nucleotide sequence determination.

The modified b12 heavy chain variable region
PCR product was digested with EcoR I and Sst I and
purified on a 2.5% Nu-Sieve agarose gel (FMC). The
mammalian expression vector DNA containing the IgG1
sequences was digested in parallel with EcoR I and
Sst I enzymes to remove the original VH region.
The PCR modified heavy chain variable region was
ligated to the constant regions in the mammalian
expression vector using T4 DNA ligase under
conditions well known to those of skill in the art
and transformed into DH5α competent cells following
the manufacturer's recommended procedures (GIBCO,
BRL Life Technologies, Gaithersburg, MD). The PCR
modified heavy chain variable region was inserted
in the same reading frame as the constant regions
of the human IgG1 gene in the pSG-5 vector.
Miniprep DNAs were analyzed and large scale plasmid
preparations performed. The nucleotide sequence of
the 5' untranslated region including the Kozak
sequence, mouse B72.3 heavy chain leader sequence,
heavy chain variable region, heavy chain constant
regions, and SV40 signal sequence was determined by
the dideoxy-nucleotide chain termination method
(Sanger et al., supra).

b. Construction of a b12 Light Chain pSG-5
Mammalian Expression Vector
1) Modification of b12 Light Chain to
Introduce a Kozak Sequence,
Mammalian Leader Sequence, and Human
Light Chain Consensus Sequence
The b12 light chain was cloned into
a separate pSG-5 expression vector (Green et al.,
supra). The double-stranded Fab b12 DNA was used
as a template for isolating the gene encoding the
light chain of the Fab b12; the amino acid residue
sequence of the light chain of Fab b12 is listed in
SEQ ID NO 97. Mouse B72.3 IgG1 DNA (Whittle et
al., Protein Eng., 1:499 (1987) and Bruggemann et
al., J. Exp. Med., 166:1351 (1987)) was used as a
template for isolating the mouse B72.3 leader
sequence. Fab b12 and mouse B72.3 IgG1 DNA were
thus used as templates for a PCR amplification for
the construction of a DNA fragment consisting of
the unique Kozak sequence for control of light
chain expression, the mouse B72.3 light chain
leader sequence (MGVPTQLGLLLWLTDARC (SEQ ID NO 153
from amino acid residue sequence 1 to 20)), and the
b12 light chain beginning with a human light chain
amino acid consensus sequence (EIVLTQSP (SEQ ID NO
153 from amino acid residue sequence 21 to 28)).
Altering the beginning of the light chain from the
mouse amino acid consensus sequence to the human
amino acid consensus sequence also destroys the
original Sac I cloning site. The restriction site,
EcoR I, was introduced in the amplification
reactions and was located at both the 5' and 3'
ends of the fragment. The procedure for creating
this fragment by combining the products of two
separate PCR amplifications is described below.

The primer pair, LC-1 (SEQ ID NO 163) and LC-2
(SEQ ID NO 164), was used in the first PCR reaction
as performed above to amplify the Fab b12 light
chain gene and incorporate the human light chain
consensus sequence into the fragment and the EcoR I
cloning site into the 3' end of the b12 light chain
fragment. For the PCR reaction, 1 µl containing
100 ng of Fab b12 DNA was admixed with 10 µl of 10X
PCR buffer in a 0.5 ml microfuge tube. To the DNA
admixture, 8 µl of a 2.5 mM solution of dNTPs
(dATP, dCTP, dGTP, dTTP) was admixed to result in a
final concentration of 200 µM of each dNTP. 1 µl
(equivalent to 20 pM) of the LC-1 primer and 1 µl
(20 pM) of the 3' backward LC-2 primer were admixed
into the DNA solution. To the admixture, 73 µl of
sterile water and 2.5 units of Taq DNA polymerase
were added. Two drops of mineral oil were placed on
top of the admixture and 35 rounds of PCR
amplification in a thermocycler were performed.
The amplification cycle consisted of 52°C for 1
minute, 72°C for 2 minutes and 94°C for 0.5
minutes.
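The reagent arithmetic for the reaction above can be checked with a short calculation. The following is a minimal sketch only: the 100 µl final volume is an assumption (the listed components total 94 µl and the volume of the Taq polymerase is not stated), and the function and variable names are hypothetical.

```python
def final_concentration_um(stock_mm: float, added_ul: float, total_ul: float) -> float:
    """Final concentration in micromolar after dilution (C1 * V1 = C2 * V2)."""
    return stock_mm * 1000.0 * added_ul / total_ul


# Volumes stated in the text, in microliters.
components_ul = {
    "template DNA": 1.0,
    "10X PCR buffer": 10.0,
    "dNTP mix (2.5 mM each)": 8.0,
    "LC-1 primer": 1.0,
    "LC-2 primer": 1.0,
    "sterile water": 73.0,
}
listed_total_ul = sum(components_ul.values())

# Assumption: the reaction is brought to 100 µl; this is not stated in the text.
ASSUMED_FINAL_VOLUME_UL = 100.0

dntp_um = final_concentration_um(2.5, 8.0, ASSUMED_FINAL_VOLUME_UL)
buffer_x = 10.0 * components_ul["10X PCR buffer"] / ASSUMED_FINAL_VOLUME_UL

print(f"listed components: {listed_total_ul} µl")
print(f"each dNTP: {dntp_um} µM, buffer: {buffer_x}X")
```

Under the assumed 100 µl volume the calculation reproduces the stated 200 µM of each dNTP and a 1X buffer concentration.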

The primer pair, LC-3 (SEQ ID NO 165) and LC-4
(SEQ ID NO 166) as shown in Table 10, was used in a
separate PCR reaction to amplify the mouse light
chain B72.3 leader sequence and incorporate an EcoR
I cloning site at the 5' end of the fragment and to
introduce a 27 base pair sequence which has
homology to the modified light chain fragment
prepared above. Double-stranded DNA encoding the
mouse B72.3 IgG1 (Whittle, et al., supra) was used
as a template for preparation of the mouse B72.3
leader sequence. The PCR reaction to prepare the
mouse leader sequence fragment was performed using
the same conditions as described in Example 4a for
the preparation of the modified VH fragment.

The resultant PCR modified b12 light chain DNA
fragment and light chain mouse leader sequence
fragment were purified by electrophoresis in a 2.5%
Nu-Sieve agarose gel (FMC). The areas of the
agarose containing the modified b12 light chain DNA
fragment and the light chain mouse leader sequence
fragment were excised from the agarose.

A third PCR amplification using the primer
pair, LC-1 (SEQ ID NO 157) and LC-4 (SEQ ID NO
166) as shown in Table 10, was performed to fuse
the light chain mouse leader fragment with the
modified light chain fragment. The primers used
for this amplification were designed to preserve an
EcoR I site, a unique Kozak sequence, and the mouse
B72.3 light chain leader sequence on the 5' end of
the amplified fragment and to preserve the EcoR I
cloning site on the 3' end of the amplified
fragment. The templates used in this PCR reaction
were the two purified PCR reaction products
described above. The PCR reaction and subsequent
purification of the PCR product were performed as
described in Example 4a1.

2) Insertion of Modified b12 Light
Chain into pSG-5 Mammalian
Expression Vector
The modified b12 light chain PCR
product was ligated to a pSG-5 vector (Figure 24).
The pSG-5 vector had the same features described in
Example 4a2 but did not contain a human IgG1 gene.

The modified b12 light chain PCR product was
digested with EcoR I and purified on a 2.5%
Nu-Sieve agarose gel (FMC). The pSG-5 vector DNA was
digested in parallel with EcoR I enzyme. The PCR
modified light chain was ligated to the pSG-5
vector using T4 DNA ligase (New England Biolabs,
Beverly, MA) and transformed into DH5α competent
cells (GIBCO BRL Life Technologies, Gaithersburg,
MD) following manufacturer's instructions.
Miniprep DNAs were analyzed and isolation of
plasmid DNA performed. The nucleotide sequence of
the light chain gene was determined using the
dideoxy-nucleotide chain termination method (Sanger
et al., supra). The nucleotide sequence of the 5'
untranslated region, mouse B72.3 light chain leader
sequence, light chain variable region, light chain
constant region, and SV40 signal sequence was
obtained. The nucleotide and amino acid residue
sequences are illustrated in Figures 25A and 25B
and are given in the sequence listing as SEQ ID NOs
152 and 153.

c. Transient Expression of b12 Heavy and
Light Chain Genes in pSG-5 Vectors in
COS-7 Cells

1) Transient Expression of b12 IgG1 in
COS-7 Cells

The human heavy and light chains in
the separate pSG-5 expression vectors were
cotransfected and transiently expressed in COS-7
cells. COS-7 cells (SV40 transformed African Green
Monkey Kidney Cells) provide a rapid and convenient
method to test the expression and function of the
antibody genes. The COS-7 cells constitutively
express the SV40 large T antigen which supports the
transient replication of episomes carrying the SV40
origin of replication. The pSG-5 expression vector
has an SV40 origin of replication. Upon
transfection into COS-7 cells, the expression
vectors are replicated in the nucleus to a high
copy number, resulting in relatively high transient
expression levels.

COS-7 cells were obtained from the American
Type Culture Collection (CRL 1651) and cultured in
Dulbecco's modified Eagle's medium (DMEM),
supplemented with 10% fetal bovine serum (GIBCO
BRL, Gaithersburg, MD), 1% penicillin, and 1%
streptomycin. Transfections were performed with 10
µg of plasmid DNA per 100 mm tissue culture plate
containing 1 x 10^6 cells. The control plate was
transfected with plasmid vector DNA without an
insert. The plates were incubated at 37°C after
transfection. The supernatants were harvested at
48 hours and tested for gp120 binding specificity
in an ELISA assay.

2) ELISA Assay for the Detection
of Binding of b12 IgG1 to gp120
Supernatants from COS-7
transfectants were tested for binding to gp120 in
an ELISA assay. Briefly, the ELISA plate was
coated with recombinant IIIB gp120 antigen at a
concentration of 1 µg/ml. The serially diluted
supernatant containing the b12 antibody was added
to the wells and incubated at 37°C for 1 hour.
After washing the plate to remove unbound antibody,
a goat anti-human Ig Fc horseradish peroxidase
(HRP) conjugated secondary antibody was added and
incubated for an additional hour. An OPD substrate
for the HRP conjugated antibody was added and the
HRP activity detected by determining the absorbance
at 490 nm.

d. Insertion of the b12 Heavy Chain IgG1
into the pEE6 Mammalian Expression Vector
to Create pEe6HC BM 12
After confirmation that the antibody
molecule expressed by the heavy and light chain
pSG-5 expression vectors bound gp120 as described
in Example 4c, the heavy chain was removed from the
pSG-5 vector and ligated into the pEE6 mammalian
expression vector (Bebbington et al.,
Bio/Technology, 10:169 (1992)). The pEE6 vector
(Celltech, England) contains an HCMV promoter and
the glutamine synthetase gene (GS). The pEE6
vector was chosen because of the GS gene which
serves as a selectable marker. CHO cells are
devoid of GS activity and thus are dependent on a
supply of glutamine in the culture medium. Cells
transfected with the pEE6 vector containing the GS
gene are able to synthesize glutamine from
glutamate and can survive in the absence of
glutamine in the culture medium. For CHO cells,
the addition of methionine sulfoximine (MSX) leads to
amplification of the transfected plasmid DNA.

The heavy chain pSG-5 vector was digested with
EcoR I and Bgl II to remove the 5' untranslated
region including the unique Kozak sequence, mouse
heavy chain B72.3 leader sequence, and heavy chain
variable and constant regions from the pSG-5
vector. The pEE6 vector was digested with
EcoR I and BamH I. Both the vector and heavy chain
DNAs were analyzed on a 0.7% low melting point
agarose (LMPA) gel. The 3.5 kb heavy chain band
and the 4.68 kb pEE6 vector band were excised from
the gel and ligated together in the presence of the
LMPA at 15°C overnight with 1 µl of T4 DNA ligase
and 1 µl of 10X ligase buffer (New England Biolabs,
Beverly, MA). Upon ligation, the EcoR I site is
reconstituted but the BamH I and Bgl II sites are
destroyed. Prior to transformation, 5 µl of the
ligated DNA in LMPA was diluted with 20 µl of TCM
buffer (10 mM Tris, 10 mM CaCl2, and 10 mM MgCl2).
Only 10 µl of the 25 µl was used for the
transformation. The ligated circular plasmid DNA
construct was transformed into maximum efficiency
DH5α competent cells. The standard protocol for
transformation was used, wherein the DNA and 100 µl
of the competent bacterial mix (GIBCO BRL,
Gaithersburg, MD) were incubated on ice for 20
minutes and heat shocked at 42°C followed by
incubation on ice for 2 minutes. About 900 µl of
SOC (GIBCO BRL, Gaithersburg, MD) was added to the
transformation. Only 100 µl of the 1000 µl of the
transformed cells was plated on LB with
carbenicillin plates (carbenicillin at 50 ng/ml).
The plates were incubated at 37°C overnight.
Twelve individual colonies were picked for miniprep
analysis. Several diagnostic digests confirmed the
presence of the heavy chain insert. Plasmid DNA
was isolated on a CsCl gradient (Sambrook et al.,
supra). The nucleotide and amino acid residue
sequences are illustrated in Figures 27A through
27E and are given in the sequence listing as SEQ
ID NOs 154 and 155.
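The dilution steps in the transformation described above determine how much of the original ligation reaction reaches each plate. The following is a minimal sketch of that bookkeeping; it follows the text in treating the recovery volume as 1000 µl (that is, neglecting the 10 µl of added DNA), and all variable names are hypothetical.

```python
# Volumes stated in the text, in microliters.
ligation_taken_ul = 5.0      # ligated DNA in LMPA
tcm_added_ul = 20.0          # TCM buffer used for dilution
used_for_transformation_ul = 10.0
recovery_volume_ul = 1000.0  # 100 µl cells + about 900 µl SOC, as stated
plated_ul = 100.0

diluted_volume_ul = ligation_taken_ul + tcm_added_ul

# Fraction of the diluted ligation that entered the transformation.
fraction_transformed = used_for_transformation_ul / diluted_volume_ul

# Fraction of the transformed culture that was spread on the plate.
fraction_plated = plated_ul / recovery_volume_ul

# Ligation-equivalent volumes at each step.
ligation_equiv_transformed_ul = ligation_taken_ul * fraction_transformed
ligation_equiv_plated_ul = ligation_equiv_transformed_ul * fraction_plated

print(f"diluted ligation: {diluted_volume_ul} µl")
print(f"ligation equivalent transformed: {ligation_equiv_transformed_ul} µl")
print(f"ligation equivalent plated: {ligation_equiv_plated_ul} µl")
```

On these figures the plate receives the equivalent of about 0.2 µl of the original ligation reaction.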

e. Insertion of the b12 Light Chain into the
pEE12 Mammalian Expression Vector
The light chain was ligated into the
pEE12 vector (Celltech, England) from the pSG-5
vector involving similar steps as described in
Example 4d for the heavy chain. The pEE12 vector
has a human CMV promoter for expression of the
light chain, a polylinker to provide cloning sites,
and a polyadenylation signal for termination of
transcription. The vector also contains the GS
selectable marker gene, whose expression is
controlled by an SV40 early promoter at the 5' end
of the GS gene, an intron, and a polyadenylation
signal at the 3' end of the GS gene.

1) Preparation of Modified b12 Light
Chain

The 5' PCR primer was designed to
replace the EcoR I cloning site with a Hind III
cloning site. The 3' PCR primer maintained the
EcoR I cloning site.

The primer pair, LC-5 (SEQ ID NO 167) and LC-2
(SEQ ID NO 165), was used in the PCR reaction as
described in Example 4a1 to amplify the Fab b12
light chain gene and incorporate Hind III and EcoR I
cloning sites into the 5' and 3' ends of the fragment,
respectively. The b12 pSG-5 vector containing the
b12 light chain was used as the template in the PCR
reaction. For the PCR reaction, 1 µl containing
100 ng of b12 pSG-5 DNA was admixed with 10 µl of
10X PCR buffer in a 0.5 ml microfuge tube. To the
DNA admixture, 8 µl of a 2.5 mM solution of dNTPs
(dATP, dCTP, dGTP, dTTP) was admixed to result in a
final concentration of 200 micromolar (µM) of each
dNTP. 1 µl (equivalent to 20 pM) of the LC-5 primer
and 1 µl (20 pM) of the 3' backward LC-2 primer were
admixed into the DNA solution. To the admixture,
73 µl of sterile water and 2.5 units of Taq DNA
polymerase were added. Two drops of mineral oil
were placed on top of the admixture and 35 rounds
of PCR amplification in a thermocycler were
performed. The amplification cycle consisted of
52°C for 1 minute, 72°C for 2 minutes and 94°C for
0.5 minutes.

The resultant PCR modified b12 light chain DNA
fragment was purified by electrophoresis in a 2.5%
Nu-Sieve agarose gel (FMC). The area in the
agarose containing the modified b12 light chain DNA
fragment was isolated from the agarose.

2) Insertion of the Modified b12 Light
Chain into the pEE12 Mammalian
Expression Vector
The modified b12 light chain
purified PCR product and the pEE12 vector were
digested with Hind III and EcoR I in separate
reactions. The digested DNAs were analyzed on an
LMPA gel, the DNA excised, and ligated together in
the presence of the LMPA gel as described for the
heavy chain construct in Example 4d. The ligation
products were transformed into DH5α competent
cells, minipreps analyzed, and DNA prepared as
described for the heavy chain constructs in Example
4d.

f. Insertion of the Modified b12 Heavy Chain
into the pEE12 Mammalian Expression
Vector Containing the b12 Light Chain to
Create the Combinatorial Vector pEe12
Combo BM 12

A heavy chain cassette comprising the
HCMV promoter, enhancer elements, heavy chain gene,
and polyadenylation signal was removed from the
pEE6 vector and inserted into the pEE12 vector
containing the b12 light chain gene, prepared in
Example 4e, to generate the combinatorial
construct, pEe12 Combo BM 12, containing both the
b12 light and heavy chain genes (Figure 28).

The heavy chain cassette was removed from the
pEE6 vector by digestion with Bgl II and Sal I. The
pEE12 vector containing the light chain gene,
prepared in Example 4e, was also digested with
Bgl II and Sal I. The heavy chain cassette and the
pEE12 vector containing the light chain gene from
Example 4e were ligated together at the Bgl II and
Sal I sites as described in Example 4d. The
combinatorial construct was transformed into DH5α
competent cells and miniprep DNA was analyzed for
the presence of the heavy and light chains as in
Example 4d. The nucleotide sequence of the heavy
and light chain genes was determined. The
nucleotide sequence of pEe12 Combo BM 12, the pEE12
vector containing the b12 heavy and light chain
genes, is given in the sequence listing as SEQ ID NO
156 and is illustrated in Figures 29A through 29R.

g. gp120 Binding of b12 IgG1 Antibody
Expressed from the Heavy and Light Chain
Genes in the Combinatorial Vector pEe12
Combo BM 12

The combinatorial pEe12 Combo BM 12
vector containing both the heavy and light chain
genes was used to transfect CHO cells. Stable
clones were selected in Glasgow Minimal Essential
Media (GIBCO) supplemented with 10% dialyzed fetal
bovine serum and 50 µM methionine sulfoximine (MSX).
Several clones were isolated and expanded in 6-well
cluster dishes. The supernatants of subconfluent
cultures were harvested and tested by ELISA for
binding to gp120 as described in Example 4c2. The
clone producing the highest levels of b12 IgG1 as
determined by ELISA with gp120 IIIB was chosen for
further study. The antibody was purified by
affinity chromatography using protein A as
described in Sambrook, et al., supra. The affinity
of b12 IgG1 for gp120 IIIB as measured by surface
plasmon resonance as described in Example 2b6c is
1.3 x 10^9 M^-1.
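The affinity above is reported as an association constant in units of M^-1. For comparison with affinities quoted as dissociation constants, the reciprocal can be taken, as in the short sketch below. The conversion is standard (Kd = 1/Ka); the variable names are illustrative only.

```python
# Association constant reported in the text for b12 IgG1 binding to gp120 IIIB.
ka_per_molar = 1.3e9

# The equilibrium dissociation constant is the reciprocal of Ka.
kd_molar = 1.0 / ka_per_molar
kd_nanomolar = kd_molar * 1e9

print(f"Kd = {kd_molar:.2e} M ({kd_nanomolar:.2f} nM)")
```

The reported value therefore corresponds to a dissociation constant of somewhat under 1 nM.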

5. Neutralizing Activity of Recombinant b12 Whole
IgG1 Antibody (b12 IgG1) Against HIV-1 In
Vitro

The key issue in producing antibodies to HIV-1
for therapeutic or prophylactic purposes is that
they should be highly potent (of high affinity and
neutralizing ability) and be cross reactive with a
wide range of primary clinical (field) isolates.
These are generally two opposing characteristics.
The ability of b12 whole IgG1 antibody (b12 IgG1)
to neutralize the infectivity of laboratory strains
of HIV-1 and a wide variety of primary clinical
isolates has been examined in p24 ELISA assays,
microplaque assays, and by syncytial formation
assays.

The primary clinical isolates used as a source
of HIV-1 virus in these assays were obtained from
various regions of the world through three
organizations: the
World Health Organization (WHO), the Henry M.
Jackson Foundation for the Advancement of Military
Medicine (HMJFAMM), and the National Institute of
Allergy and Infectious Diseases (NIAID). Isolates
from the WHO Network for HIV-1 Isolation and
Characterization were obtained through the AIDS
Research and Reference Reagent Program, Division of
AIDS, NIAID, NIH. Isolates from HMJFAMM were
provided by Dr. John Mascola, Walter Reed Army
Institute of Research, Rockville, MD and Dr.
Francine McCutchan, Henry M. Jackson Research
Laboratory, Rockville, MD. Isolates from NIAID were
kindly provided by Dr. Jim Bradac, Division of
AIDS, NIAID, NIH.

The HIV-1 viruses were collected from various
regions of the world, expanded in
mitogen-stimulated peripheral blood mononuclear cells
(PBMC) (Mascola et al., J. Infect. Dis., 169:48-54
(1994)), and culture supernatants containing
infectious virus were stored in central
repositories at -70°C. The designation of viruses
into clades was made on the basis of sequence
information based on the gag gene or on the V2-C5
region of gp120, or in some cases, after
heteroduplex mobility analysis (Louwagie et al.,
AIDS, 7:769-772 (1993) and Delwart et al., Science,
262:1257-1261 (1993)).

The HIV-1 viruses include a set of 14 primary 
isolates which contain a high proportion of 
isolates which are relatively refractory to 
antibody neutralization by sera from other HIV-1 
infected individuals (Wrin et al., J. Acq. Imm.
Def. Synd., 7:211-219 (1994)), 12 primary infant
isolates obtained at birth or within two weeks of 
age, and 69 international isolates belonging to 6 
different clades. 

Several different neutralization assays were
performed because HIV-1 neutralization by antibody
shows considerable variation depending upon the
assay used and the precise experimental conditions
such as inoculum size and incubation time of virus
and antibody (D'Souza et al., AIDS, 8:169-173
(1994)). By performing neutralization assays on a
range of laboratory and primary isolates in a
number of different laboratories, it has been
demonstrated that b12 IgG1 is a highly potent
neutralizing antibody effective against a wide
breadth of isolates.

a. Quantitative Neutralization of HIV-1 MN
and IIIB by b12 IgG1 as Measured in a
Plaque Assay

b12 IgG1 was initially tested for its
ability to neutralize the HIV-1 laboratory strains
MN and IIIB in a plaque formation assay in
laboratories which recently tested a panel of
monoclonal antibodies as part of the NIAID/WHO
Antibody Serological Project (D'Souza et al.,
supra).

b12 IgG1 showed 50% neutralization titers of 3
ng/ml for the MN strain and 7 ng/ml for the IIIB
strain using plaque formation (Hanson, et al., J.
Clin. Microbiol., 28:2030-2034 (1990)) to determine
the ability of the antibody to inhibit infectivity
of the HIV-1 strains.

b. Quantitative Neutralization of HIV-1 MN
and IIIB by b12 IgG1 as Measured by
Syncytial Formation

b12 IgG1 showed 50% neutralization titers
of 20 ng/ml for both MN and IIIB strains using
syncytial formation as the reporter assay as
described in Example 3b (Nara et al., AIDS Res.
Human Retroviruses, 3:283-302 (1987)).

The syncytial formation assay was performed as
described in Example 5c. Briefly, virus was grown
in H9 cells. For infectivity measurement,
monolayers of CEM-SS target cells were cultured
with 100-200 syncytial forming units (SFUs) of
virus, in the presence or absence of antibody, and
the number of syncytia determined after 3-5 days of
incubation. The assays were repeatable over a
virus-surviving fraction range of 1 to 0.001 within
a 2 to 4-fold difference in the concentration of
antibody (P<0.001).

c. Neutralization of Primary Virus Isolates
by b12 IgG1 as Measured by the p24 ELISA
Assay

The ability of b12 IgG1 to neutralize
infectivity of PBMCs by HIV-1 virus was
quantitatively measured in the p24 ELISA assay
(Daar et al., Proc. Natl. Acad. Sci. U.S.A.,
87:6574-6578 (1990) and Ho et al., J. Virol.,
65:489-493 (1991)). The p24 ELISA assay is further
described in Example 3a.

1) Neutralization of Ten Primary Virus
Isolates by b12 IgG1
HIV-1 viruses were isolated from 10
individuals from various locations in the U.S. and
with varying disease status. The HIV-1 viruses had
been cultured only once or twice in peripheral
blood mononuclear cells (PBMCs). Viral stocks were
grown in PBMCs and the assay was performed in
PBMCs.

Briefly, HIV-1 virus at 50 TCID50 and varying
concentrations of b12 IgG1 were incubated together
for 30 min at 37°C before addition to
PHA-stimulated PBMCs. HIV-1 virus replication was
assessed after incubation for 5 to 7 days by p24
ELISA measurement as described in Example 3a.
HIV-1 virus positive controls used in this assay were
the molecularly cloned HIV-1 virus JR-CSF and the
HIV-1 isolate JR-FL (O'Brien et al., J. Virol.,
66:3125-3130 (1992), O'Brien et al., Nature,
348:69-73 (1990), and O'Brien et al., J. Virol., in
press (1994)). Stocks of JR-CSF were prepared by
infection of PBMC with supernatants initially
obtained by DNA transfection. HIV-1 IIIB and HIV-1
MN are viruses with an extensive history of passage
in transformed T-cell lines (Robert-Guroff et al.,
Nature, 316:72-74 (1985)). Stocks of these strains
grown in H9 cells were passaged in
mitogen-stimulated PBMC to prepare viruses that had been
grown in the same cells as the primary viruses, to
eliminate the influence of any host cell-dependent
epigenetic factors on virus neutralization (Wrin, et
al., J. Acq. Imm. Def. Synd., 7:211-219 (1994)).
The stock of PBMC-grown MN was a gift from A. N.
Conley (Merck Research Labs).

2) Neutralization of 12 Primary
Infant Isolates by b12 IgG1
b12 IgG1 was also tested for
the ability to neutralize infectivity of a panel of
12 primary infant isolates in the p24 ELISA assay.
Virus isolates were obtained from 12 infants born
to HIV-1 seropositive mothers; 7 were obtained at
birth and 5 between birth and 14 days of age. All
the infants were from California. Virus was
isolated from patient PBMCs by coculture with PBMCs
from healthy seronegative donors. Viral stocks
were prepared by passaging the last positive
culture dilution once into PBMCs. All of the
isolates, except one (isolate 7), were
non-syncytial inducing in MT2 cells and therefore could
not be assayed in the syncytial forming assay as
herein described. HIV-1 virus from these stocks
was grown in PBMCs and neutralization assessed
using PHA-stimulated PBMCs as indicator cells and
determination of extracellular p24 as the reporter
assay essentially as described in Example 3a (AIDS
Clinical Trials Group Virology Manual for HIV
Laboratories, Department of AIDS Research, NIAID,
NIH, version 2.0 (1993)).

Serial dilutions of b12 IgG1 (0.3 to 20 µg/ml)
were incubated with 20 TCID50 or 100 TCID50 virus in
triplicate for 2 hours at 37°C before addition to
PHA-stimulated PBMCs. Virus replication was
assessed after 5 days by p24 ELISA measurement.
Neutralization was expressed as either a 50% or 90%
reduction in p24 antigen as compared to values
observed in the absence of antibody (Table 6).

d. Neutralization of Primary Virus Isolates
by b12 IgG1 as Measured in a Microplaque
Assay

A quantitative microplaque assay to
measure the reduction of infectivity of primary
clinical isolates of HIV-1 in the presence of the
b12 IgG1 and pooled human plasma was performed as
described in Hanson et al., J. Clin. Microbiol.,
28:2030-2034 (1990). The set of primary clinical
isolates was chosen to contain a high proportion of
isolates which are relatively refractory to
antibody neutralization by sera from other HIV-1
infected individuals (Wrin et al., J. Acq. Imm.
Def. Synd., 7:211-219 (1994)). Viruses were grown
in PBMCs and the assay carried out in MT2 cells.
This limits study to viruses which grow in this
cell line but provides an additional measure of
neutralization.

Primary clinical isolates of HIV-1 were
isolated from frozen peripheral blood lymphocytes
obtained from seropositive donors as described in
Gallo et al., J. of Clin. Microb., 1291-1294 (1987)
and cultivated in peripheral blood mononuclear
cells (PBMC). Briefly, HIV isolates were obtained
by incubating frozen HIV-infected patient PBMCs
with seronegative donor PBMCs in RPMI-1640 medium
containing 20% heat-inactivated fetal bovine serum,
2 µg/ml polybrene, 5% interleukin-2, and 0.1%
anti-human leukocyte interferon. The cultures were fed
with fresh donor PBMCs once a week, and the
supernatants were assayed for the presence of
reverse transcriptase (RT) activity beginning at
day 11. The cultures were considered positive if,
for 2 consecutive weeks, the RT counts were
>10-fold higher than those in the cultures of the
seronegative donor PBMCs alone.
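The positivity criterion stated above (RT counts more than 10-fold over the seronegative control for 2 consecutive weeks) can be expressed as a simple rule. The sketch below is illustrative only; the function name and the weekly counts are hypothetical and are not data from this study.

```python
def is_rt_positive(culture_counts, control_counts, fold=10.0, consecutive=2):
    """Return True if the culture exceeds fold x control for the required
    number of consecutive weekly measurements."""
    run = 0
    for culture, control in zip(culture_counts, control_counts):
        if culture > fold * control:
            run += 1
            if run >= consecutive:
                return True
        else:
            run = 0  # the weeks must be consecutive
    return False


# Hypothetical weekly RT counts (arbitrary units), for illustration only.
control = [200, 210, 190, 205]
culture_a = [400, 2500, 3100, 5000]   # above 10x control in weeks 2 and 3
culture_b = [400, 2500, 900, 2600]    # above 10x control in weeks 2 and 4 only

print(is_rt_positive(culture_a, control))
print(is_rt_positive(culture_b, control))
```

In this illustration the first culture meets the criterion and the second does not, because its two elevated readings are not in consecutive weeks.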

The resultant RT-positive virus isolates were
tested for cytolysis in the MT4 (α-4 clone) (Hanson
et al., supra). Cytolysis in MT4 is a requirement
for viruses to be usable in the subsequent MT2
microplaque assay system. Supernatant fluids from
the primary PBMC isolation cultures were used to
infect expanded cultures of phytohemagglutinin
(PHA)-stimulated PBMCs from healthy seronegative
blood donors. These infected PBMC cultures were
grown in RPMI-1640 medium supplemented with 15%
fetal bovine serum, 5% interleukin-2, 0.1% anti-α
interferon, 2 µg/ml polybrene, 50 µg/ml gentamicin,
100 U/ml penicillin, and 100 µg/ml streptomycin.
The crude supernatants were harvested after 7 days
and frozen as viral stocks at -70°C.

The primary clinical isolates of HIV-1 used in
this microplaque assay are given in Table 6.
VL134, VL648, and VL025 are viruses isolated from
infected mothers in New York in 1992; UG266 and
UG274 are clade D isolates which were a gift from
John Mascola, Division of Retrovirology, Walter
Reed Army Institute of Research; the remaining
viruses were isolated from homosexual males in
California in 1992. The pooled human plasma
preparation, containing neutralizing antibody, was
derived from 13 HIV-1 positive individuals selected
for high neutralization titer against the MN
isolate. The laboratory HIV-1 strains MN and IIIB
were propagated in H9 cells as controls in the
microplaque assay.

b12 IgG1 and a pool of human plasma from 13
HIV-1 seropositive patients were used as the source
of neutralizing antibodies in a 96-well microtiter
plaque reduction assay as described by Hanson et
al., supra. Briefly, 3-fold serial dilutions of
the b12 IgG1 or heat-inactivated pooled patients'
plasma were combined in quadruplicate with an equal
volume containing 20 plaque-forming units (PFU) of
HIV-1 virus per well and incubated for 18 hours at
37°C. Negative control wells also contained 50%
normal human serum pool with no patient immune
serum. After the 18 hour incubation of Fabs or
serum and virus, 90,000 MT2 cells were added per
well and incubated at 37°C for 1 hour. SeaPlaque
Agarose in assay medium at 39.5°C was then added to
a final concentration of 0.8%. While the warm
agarose was still molten, the microtiter plates
were centrifuged at 20°C for 20 minutes at 500 x g
to form cell monolayers. The plates were incubated
for 6 days at 37°C and then stained 18 to 24 hours
with 50 µg/ml propidium iodide. The fluorescent
plaques were counted with transillumination by a
304 nm ultraviolet light source using a low-power
stereo zoom microscope. Inhibition of infectivity,
or neutralization titer, is defined as the µg/ml of
Fab or the plasma dilution giving 50% inhibition of
plaque count as compared with controls without
antibody. This dilution was interpolated between
data points.
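The interpolation of a 50% inhibition titer between data points can be sketched as follows. The text does not state the interpolation scheme; linear interpolation on log10 concentration between the two bracketing dilutions is assumed here, and the function name and plaque counts are hypothetical illustrations rather than data from the assay.

```python
import math


def titer_50(concentrations, plaque_counts, control_count):
    """Interpolate the concentration giving 50% inhibition of plaque count.

    Assumes concentrations are in ascending order and interpolates linearly
    in log10(concentration) between the two points bracketing 50%.
    Returns None if 50% inhibition is not bracketed by the data.
    """
    target = 0.5
    points = [
        (math.log10(c), 1.0 - n / control_count)
        for c, n in zip(concentrations, plaque_counts)
    ]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if (y0 - target) * (y1 - target) <= 0 and y0 != y1:
            x = x0 + (target - y0) * (x1 - x0) / (y1 - y0)
            return 10 ** x
    return None


# Hypothetical 3-fold dilution series (µg/ml) and mean plaque counts per well.
concs = [0.1, 0.3, 0.9, 2.7]
counts = [18, 14, 6, 1]
control = 20  # mean plaque count without antibody

estimate = titer_50(concs, counts, control)
print(f"50% inhibition at about {estimate:.2f} µg/ml")
```

With these illustrative counts, 50% inhibition falls midway (on the log scale) between 0.3 and 0.9 µg/ml. A titer reported as ">200" or "<1:10" corresponds to the case where the tested range never brackets 50% and the function returns None.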

e. Results of the Neutralization Assays by
b12 IgG1 with Laboratory Virus Isolates
Results of the ability of the b12 IgG1 to
neutralize laboratory virus isolates in both the
plaque and syncytial formation assays suggest the
antibody is approximately two orders of magnitude
more potent than other CD4 site antibodies in the
WHO/NIAID Project and comparable to the best
antibodies directed to the V3 loop of gp120.
However, whereas antibodies directed to the V3 loop
of gp120 are strongly strain specific, b12 IgG1 is
roughly equally effective against MN and IIIB. The
b12 IgG1 antibody is comparable in potency to a
CD4-IgG molecule in these assays (Example 3c). In
a separate assay using p24 production to determine
infectivity (Daar et al., Proc. Natl. Acad. Sci.
U.S.A., 87:6574-6578 (1990) and Ho et al., J.
Virol., 65:489-493 (1991)), 50% neutralization
titers of less than 40 ng/ml were found for both
the MN and IIIB laboratory strains.

f. Results of the Neutralization Assays by
b12 IgG1 with Primary Virus Isolates
b12 IgG1 showed essentially complete
neutralization of 7 of 10 isolates at 5 µg/ml with
all the isolates showing 50% neutralization at ≤1
µg/ml as determined in the p24 reporter assay
(Figure 21).

The inhibition of infectivity, or
neutralization titer, for b12 IgG1 and the pooled
HIV seropositive human plasma from 13 donors is
given in Table 6. For b12 IgG1, the neutralization
titer for each of the viral isolates is expressed
as the minimum µg/ml of b12 IgG1 required for 50%
inhibition of plaque count as compared to the
controls. For the pooled plasma, the neutralization
titer for each of the viral isolates is expressed
as the minimum titer of the pooled HIV seropositive
human plasma from 13 donors required for 50%
inhibition of plaque count as compared to the
controls.

Table 6



                   b12 IgG1 50%       pooled human plasma:
         host      neutralization     dilution for 50%
virus    cell      titer (µg/ml)      neutralization

IIIB     H9        0.007              1:767
MN       H9        0.003              1:24,000

VL135    PBMC      10                 1:44
UG274    PBMC      0.7                1:37
VL134    PBMC      5.6                1:30
VL596    PBMC      8.5                1:17
UG266    PBMC      3.8                1:12
VL434    PBMC      22                 1:10
VL172    PBMC      >200               1:10
VL750    PBMC      >200               1:10
VL069    PBMC      >50                <1:10
VL077    PBMC      >200               <1:10
VL114    PBMC      <7.4               <1:10
VL263    PBMC      5.0                <1:10
VL648    PBMC      16.7               <1:10
VL025    PBMC      16.7               <1:10



The b12 IgG1 was able to neutralize ten of the
fourteen primary clinical isolates assayed at
concentrations of ≤50 µg/ml, measured as the µg/ml
required for 50% inhibition of plaque count as
compared to the controls (Table 6). Pooled human
plasma was able to neutralize 5 of the 14 primary
clinical isolates assayed at >1:10 dilution,
measured as the dilution required for 50% inhibition
of plaque count as compared to the controls without
antibody.

Table 6 shows that four isolates, which were not
neutralized even by a 1:10 dilution of pooled human
plasma, were neutralized by b12 IgG1. Most of the
viruses reported in Table 6 were isolated from U.S.
donors although two, both of which are neutralized by
b12 IgG1, were from Ugandan donors and assigned to
clade D.

Results of neutralization of 12 infant primary
isolates with b12 IgG1 as determined by p24 ELISA
measurements are given in Table 7.
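
The counts quoted above can be re-derived from the primary-isolate rows of Table 6. The following is a minimal sketch rather than part of the original analysis: the function names are hypothetical, and the handling of censored entries (a ">" entry treated as not neutralized at the cutoff, a "<" entry treated as neutralized) is an assumption made for illustration.

```python
# Primary-isolate rows of Table 6:
# (isolate, b12 IgG1 50% titer in ug/ml, pooled plasma dilution for 50% neutralization)
PRIMARY = [
    ("VL135", "10", "1:44"), ("UG274", "0.7", "1:37"),
    ("VL134", "5.6", "1:30"), ("VL596", "8.5", "1:17"),
    ("UG266", "3.8", "1:12"), ("VL434", "22", "1:10"),
    ("VL172", ">200", "1:10"), ("VL750", ">200", "1:10"),
    ("VL069", ">50", "<1:10"), ("VL077", ">200", "<1:10"),
    ("VL114", "<7.4", "<1:10"), ("VL263", "5.0", "<1:10"),
    ("VL648", "16.7", "<1:10"), ("VL025", "16.7", "<1:10"),
]


def b12_neutralizes(titer: str, cutoff: float = 50.0) -> bool:
    """True if the 50% titer is at or below `cutoff` ug/ml.

    Assumption: '>x' entries are lower bounds and count as not neutralized;
    '<x' entries are upper bounds and count as neutralized when x <= cutoff.
    """
    if titer.startswith(">"):
        return False
    return float(titer.lstrip("<")) <= cutoff


def plasma_neutralizes(dilution: str, min_dilution: float = 10.0) -> bool:
    """True if 50% neutralization was reached at a dilution beyond 1:min_dilution."""
    if dilution.startswith("<"):
        return False
    return float(dilution.split(":")[1]) > min_dilution


b12_count = sum(b12_neutralizes(t) for _, t, _ in PRIMARY)
plasma_count = sum(plasma_neutralizes(d) for _, _, d in PRIMARY)
# Isolates not neutralized even by a 1:10 plasma dilution but neutralized by b12 IgG1
rescued = [name for name, t, d in PRIMARY
           if d.startswith("<") and b12_neutralizes(t)]

print(b12_count, plasma_count, rescued)
```

Under these assumptions the script reproduces the ten of fourteen, 5 of 14, and four-isolate figures stated in the text.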

Table 7

            b12 IgG1
            Antibody Concentration (µg/ml)

Isolate     50% inhibition     90% inhibition

1           20                 >20
2           1.25               >20
3           <0.3               0.3
4           <0.3               0.6
5           2.5                20
6           5                  >20
7           5                  >20
8           <0.3               0.3
9           0.3                5
10          0.3                2.5
11          <0.3               0.6
12          <0.3               0.3

As shown in Table 7, b12 IgG1 achieved 90%
neutralization for 8 of 12 infant isolates at
concentrations of ≤20 µg/ml in the p24-based assay.
All 12 isolates were 50% neutralized in the range
of 0.3 to 20 µg/ml with the majority being
neutralized at <5 µg/ml. In contrast, a pooled
hyperimmune globulin product, HIVIG, achieved 90%
neutralization of only 3 of 12 isolates within a
concentration range up to 100 µg/ml. HIVIG is a
hyperimmune IgG preparation obtained from the
pooled plasma of selected HIV-1 asymptomatic
seropositive donors meeting the following criteria:
presence of p24 serum antibody titers >128, CD4
lymphocyte count ≥400 cells/µl and the absence of
p24 and hepatitis B surface antigen by enzyme
immunoassay (Cummins et al., Blood, 77:1111-1114
(1991)). The HIVIG used in these experiments was
lot number IHV-50-101 (North American Biologicals).

HIV-1 neutralization by antibody shows
considerable variation depending upon the assay
used and precise experimental conditions such as
inoculum size and incubation time of virus and
antibody (D'Souza et al., supra). However, by
carrying out neutralization on a range of
laboratory and primary isolates in a number of
assays in different laboratories, we have shown
that b12 IgG1 is a highly potent neutralizing
antibody effective against a wide breadth of
primary isolates. The results clearly demonstrate
that, although primary isolates may be more
difficult to neutralize by antibody than laboratory
strains, they are not intrinsically resistant
(Conley et al., Proc. Natl. Acad. Sci. U.S.A.,
91:3348-3353 (1994)). The potency of b12 IgG1
against the majority of U.S. isolates is in a
concentration range (≤5 µg/ml) which could be
achieved in vivo in passive immunotherapy.
Furthermore, the affinities of recombinant
antibodies displayed on phage can be enhanced by
mutagenesis and selection in vitro and this
strategy has been used to considerably improve the
potency and breadth of reactivity of Fab b12
(Barbas et al., Proc. Natl. Acad. Sci. U.S.A.,
91:3809-3812 (1994)). For optimal potency and
strain cross-reactivity for passive immunization, a
cocktail of in vitro improved antibodies may be
most appropriate.

The results have implications for passive
immunization and vaccine design. The ability of
b12 IgG1 to neutralize a range of primary isolates
implies conservation of a structural feature
associated with the CD4 binding site of gp120 which
is accessible to antibody and important for
neutralization. A vaccine might seek to present
this feature to the immune system. Clearly, the
feature is present on recombinant gp120 since b12
was affinity selected from a library using this
molecule. However, b12 and related antibodies
formed only a small part of the repertoire affinity
selected from this library by recombinant gp120.
Most of the antibodies obtained were far less
potent in neutralization even though they were also
directed to the CD4 binding site, were cross-
competitive with b12 for binding to recombinant
gp120 and had similar affinities to b12 (Barbas et
al., Proc. Natl. Acad. Sci. U.S.A., 89:9339-9343
(1992), Barbas et al., J. Mol. Biol., 230:812-823
(1993), and Example 2b6)(c)). Therefore,
recombinant gp120 appears to present the b12
epitope in conjunction with several other weakly
neutralizing and overlapping epitopes and its
efficacy as a vaccine may suffer. Interestingly,
evidence from antibody binding to infected cells
suggests that b12 does recognize a native
conformation of gp120 more effectively than other
CD4 binding site antibodies (Example 7). In any
case, b12 IgG1 and the library approach could be
useful in vaccine and passive immunization
evaluation. The ability of a candidate vaccine to
preferentially bind b12 and/or preferentially
select potent neutralizing antibodies from
libraries should be positive indicators for vaccine
development.

5. Determination of the Relationship Between the
Epitopes Recognized by Fabs with Purified HIV-1
Antigens

The Fabs show a spectrum of neutralizing 
abilities as described in Example 5. It was 
therefore sought to determine if the epitopes 
recognized by individual Fabs could be 
distinguished from each other, and if possible, 
determine how the epitopes recognized by the 
individual Fabs related to neutralization. 

a. Competitive ELISA between Fabs and b13
Whole IgG1 Antibody for Binding to gp120
The first method to distinguish between
the epitopes bound by the Fabs of this invention
was to compare the epitope recognized by the Fab
b13 with the other Fabs. The Fab b13 had been
spliced to the Fc region of IgG1 to generate a
whole IgG1 molecule and therefore contains the Fc
region of the IgG1 antibody. The other Fabs do not
contain the Fc region of the IgG1 antibody. The
binding of the b13 IgG1 could therefore be
distinguished from the binding of other Fabs by
using a labeled anti-Fc reagent in competition
ELISA. A competition ELISA in which the Fabs b3,
b6, b11, b12, and b14 competed with b13 IgG1 for
binding to immobilized gp120 was performed.

Competitive ELISAs were performed between the
Fabs b3, b6, b11, b12, and b14 and the b13 whole
IgG1 antibody. The whole antibody was obtained by
splicing constant domain genes with the b13 Fab and
expressing the protein in Chinese Hamster Ovary
cells (CHO) as described in Example 4g (Bender et
al., supra, and in Example 4a for the Fab b12). The
ELISA was performed as described above in Example
2b6)(b). Briefly, microtiter wells were coated
with 0.1 µg/ml of gp120 derived from the HIV-1
strain LAI in 0.1 M bicarbonate buffer at pH 8.6.
Soluble or free Fab fragments were serially diluted
from 1:100 to 1:32,000 in 0.5% BSA/0.025% Tween
20/PBS. The dilution of b13 IgG1 was held constant
at 1:10,000 in 0.5% BSA/0.025% Tween 20/PBS. The
b13 IgG1 and Fabs were admixed, added to the gp120-
coated microtiter wells and maintained for 120
minutes at 37°C. After maintenance, the wells were
carefully washed ten times with 0.05% Tween 20/PBS.
The amount of b13 IgG1 antibody bound to the plate
after washing was detected using a peroxidase-
labeled antibody specific for the Fc portion of
IgG1 contained on the b13 antibody.

Results of this assay indicated that the Fabs
b3, b6, b11, b12, and b14 are competitive with b13
IgG1 for binding to gp120, indicating that the
epitopes recognized by the individual Fabs are
probably either proximal or identical to the
epitope recognized by the b13 IgG1. A control
anti-tetanus toxoid Fab did not compete with IgG1
b13 in this assay.

Competition monitored in an ELISA format
showed that all of the Fabs compete with the b13
Fab as a whole IgG. There is also an indication
that Fabs b12 and b13 are distinct in that they are
somewhat less effective in cross-competition than
the other members of the panel.

b. Epitope Similarity Determination
Between the Fabs in Binding to gp120
Using BIAcore
A more precise method for
determining the similarity of epitopes was
performed using the BIAcore. The procedure adopted
here was to immobilize a polyclonal anti-human
F(ab')2 on the sensor chip and use this to capture
the individual Fabs. An Fab of this invention was
injected and captured by the polyclonal anti-human
F(ab')2. The captured Fab was then used to bind
gp120 derived from the HIV-1 strain LAI. The
captured Fab would thus bind the gp120 at its
respective epitope. A second Fab of this invention
was then injected. A response in the BIAcore assay
after injection of the second Fab indicates that
binding has occurred. If the second Fab injected
recognizes the same or similar epitope on the gp120
as the first Fab, no response would occur. No
response would therefore indicate that the two Fabs
tested in the assay competed for binding to the
same or similar epitope on gp120. Alternatively, a
response in the assay suggests that the epitopes
recognized by the two Fabs are distinct from one
another and that binding of the second Fab to gp120
at a second epitope is possible in the presence of
the first Fab. A response would therefore indicate
that the two Fabs tested in the assay did not
compete for binding to the same or similar epitope.

The precise epitope similarity determination
with the BIAcore was performed as follows. A flow
rate of 5 µl/min of PBS, pH 7.4 was established and
the biosensor chip was activated by injecting 30 µl
of activation solution (Pharmacia Biosensor, 50%
0.2 M N-ethyl-N'-(3-diethylaminopropyl)-
carbodiimide, 50% N-hydroxysuccinimide). The flow
rate was then adjusted to 10 µl/min and the antigen
was injected in 10 mM sodium acetate buffer, pH
4.5. Forty µl of goat anti-human F(ab')2 (Pierce)
at a concentration of 40 µg/ml in 10 mM sodium
acetate buffer, pH 4.5 was injected to give a final
immobilization of 10000 Response Units (RU). The
chip was then blocked from any further
immobilization by injecting 30 µl of 1 M
ethanolamine, pH 8.5 (Pharmacia Biosensor). The
flow rate was adjusted to 1 µl/min and 4 µl of the
first Fab at a concentration of 100 µg/ml was
injected, immediately followed by 4 µl of an anti-
cytomegalovirus Fab at a concentration of 150 µg/ml
to block any remaining binding sites on the
immobilized goat anti-human F(ab')2. Next, 4 µl of
gp120 at a concentration of 10 µg/ml was injected
followed by 4 µl of the second Fab at 100 µg/ml.
The assay was performed with a combination of all
of the Fabs to give a mosaic of binding patterns.
The entire surface was regenerated with 25 µl of 60
mM HCl so that the next cycle could be performed.

Table 8b indicates the results of the epitope
similarity determination by BIAcore. Table 8a
shows the positive and negative controls for the
clones used. The positive controls are the RU
levels obtained when the first Fab used is the
clone indicated and the second Fab is an anti-gp120
V3-loop Fab. The Fabs of this invention compete
with soluble CD4 for binding to gp120. The second
Fab, an anti-gp120 V3-loop Fab, neither competes
with soluble CD4 nor competes with anti-CD4 site
Fabs and therefore would react with a different
epitope than the Fabs of this invention. As can be
seen from the table, all positive controls result
in significant values of 125 or more, indicating
the validity of the technique to distinguish
between non-identical epitopes. The negative
controls are the values obtained when the same Fab
is injected twice. This gives the background
values for each Fab. These values were subtracted
from all subsequent experiments in order to give
true values.

An epitope map, Table 8b, was then
constructed. ND indicates that this combination of
Fabs was not performed. It can be seen from this
map that Fabs b3, b6, b11, and b14 form a set which
compete highly effectively with one another for
binding to a similar or the same epitope. For the
most part, a member of the set competes for binding
as well with another member as it does with itself
(RU ≈ 0). On the other hand, b12 and b13 appear
somewhat different in that while they compete for
binding with members of the above set, they do not
compete as effectively as the other Fabs within the
set. Further, competition for binding to the same
or similar epitope between b12 and b13 is
incomplete. This suggests that the epitopes of
Fabs b12 and b13 are sufficiently dissimilar from
those of the other four, and from each other, to
allow detectable binding when they are used in
combination with any of the other Fabs. It may
therefore be concluded that clones b3, b6, b11, and
b14 bind the same or similar epitopes, whereas Fabs
b12 and b13 bind to epitopes which can be
distinguished from the other epitopes in this
assay.

Table 8a

Fab                      b3    b6    b11   b12   b13   b14

POSITIVE CONTROL (RU)    129   128   131   125   135   134
NEGATIVE CONTROL (RU)    24    38    ND    17    15    ND

ND indicates that this combination of Fabs was not
performed.

Table 8b

Fab 1     b13   b12   b6    b3

b14       30    24    14    0
b11       54    26    14    0
b3        26    29    0     0
b6        21    17    0     ND
b12       22    0     ND    ND

ND indicates that this combination of Fabs was not
performed.

c. Comparison of Fab Epitopes with Wild-type
and Mutant Forms of gp120 Using ELISA
with gp120 in the Solid Phase
Epitope similarity determinations of the
panel of Fabs were performed with a panel of HXBc2
gp120 mutants of the HIV-1 strain LAI. Conserved
residues of gp120 were altered to generate the
HXBc2 gp120 mutants. The interaction between the
mutants and Fabs was investigated to examine
binding specificity differences between the Fabs at
greater resolution. The HXBc2 gp120 mutants used
in this assay had been previously characterized
with respect to gp160 precursor processing, gp120-
gp41 association, and CD4 binding ability
(Olshevsky et al., J. Virol., 64:5701-5707
(1990)). Both wild-type and mutant gp120s were
tested for their ability to bind a saturating
concentration of each Fab.

The epitope determination with wild-type and
mutant gp120 was performed with HIV-1 envelope
glycoproteins from culture supernatants of COS-1
cells transfected with plasmids expressing either
wild-type or mutant gp120 from the HXBc2 clone.
Microtiter wells were coated with the antibody
D7324 (Aalto BioReagents; Dublin, Ireland) which
binds to the conserved 15 amino acid sequence at
the carboxy terminus of gp120. The wild-type or
mutant gp120 were thus captured onto the surface of
microtiter wells by binding to the D7324 antibody.
A reference HIV-1 positive human serum pool at a
1:3000 dilution in 0.5% Tween 20 was assayed for
binding to the wild-type and mutant gp120s by
incubating the serum pool with the immobilized
gp120. The bound antibody was detected by a second
enzyme-conjugated antibody. The reading obtained
with the HIV-1 positive human serum pool, N=4, was
used as the reference value for each mutant. The
Fabs of this invention were then assessed for
binding to the wild-type and mutant gp120s and the
ratio of the Fab to reference serum was determined
for each gp120 mutant (Table 9). The average ratio
for the entire panel of Fabs was calculated and any
individual ratio falling below 0.5 times the mean
was considered to indicate a gp120
amino acid change that decreased Fab recognition,
while those exceeding 2.0 times the mean
indicated an amino acid change that enhanced Fab
recognition. In this way, a map of mutations
affecting the binding of the Fab to gp120 was
obtained for each clone essentially as previously
described (Helseth et al., J. Virol., 65:2119-2123
(1991) and Olshevsky et al., supra).
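
The thresholding rule described above can be sketched in a few lines. This is an illustrative reading of the rule, not the original analysis: it assumes that "deviating from the mean" refers to the ratio expressed as a multiple of the panel mean for a given mutant, and the function name `classify_mutant` is hypothetical. The example row is the 113 D/A entry of Table 9.

```python
from statistics import mean


def classify_mutant(ratios: dict[str, float],
                    low: float = 0.5,
                    high: float = 2.0) -> dict[str, str]:
    """Flag each Fab's binding to one gp120 mutant relative to the panel mean.

    ratios maps Fab name -> (Fab signal / reference serum signal).
    Assumed rule: ratio < low * mean -> decreased recognition,
                  ratio > high * mean -> enhanced recognition.
    """
    panel_mean = mean(ratios.values())
    calls = {}
    for fab, ratio in ratios.items():
        if ratio < low * panel_mean:
            calls[fab] = "decreased"
        elif ratio > high * panel_mean:
            calls[fab] = "enhanced"
        else:
            calls[fab] = "unchanged"
    return calls


# 113 D/A row of Table 9 (panel mean is 1.20)
row_113_da = {"b3": 1.46, "b6": 1.73, "b11": 1.89,
              "b12": 1.13, "b13": 0.99, "b14": 0.00}
calls_113_da = classify_mutant(row_113_da)
print(calls_113_da)
```

Under this reading only Fab b14 is flagged as decreased for 113 D/A, in line with the discussion of that mutant in the text.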

Table 9

Mutation                          Fab

                 B3     B6     B11    B12    B13    B14

45 W/S           1.60   0.61   0.50   0.68   1.20   0.28
113 D/A          1.46   1.73   1.89   1.13   0.99   0.00
113 D/R          1.40   1.50   1.61   0.67   0.71   0.00
NO V1/V2         1.07   1.48   1.42   0.23   0.86   1.68
NO V1/V2/V3      2.05   1.48   1.94   0.47   0.95   1.60
NO V3            1.88   1.64   1.92   0.46   1.08   1.72
183/184 PI/SG    0.82   0.73   0.69   0.33   0.92   0.32
207 K/W          1.15   1.57   1.19   2.54   1.30   1.36
252 R/W          1.58   1.52   1.58   1.65   1.39   2.04
256 S/Y          0.64   0.14   0.33   0.82   1.15   0.00
257 T/R          0.08   0.59   0.00   0.76   0.22   0.00
257 T/A          0.86   0.93   0.75   0.99   0.68   0.40
257 T/G          0.91   0.70   1.14   0.74   0.75   0.00
262 N/T          1.06   0.64   1.19   0.62   0.72   0.24
269 E/L          0.73   0.48   0.45   0.78   0.83   0.20
314 G/W          0.59   0.36   0.39   0.65   0.71   0.28
356 N/I          0.67   0.66   0.39   0.92   0.80   0.52
368 D/R          0.19   0.18   0.00   0.04   0.00   0.00
368 D/T          0.28   0.20   0.00   0.03   0.02   0.00
370 E/R          0.01   0.25   0.17   0.07   0.00   0.00
370 E/Q          0.25   0.89   0.58   0.46   0.14   0.00
384 Y/E          1.21   1.02   1.11   0.25   0.02   0.88
386 N/Q          0.88   0.59   0.31   1.05   0.01   0.36
395 W/S          0.92   0.59   0.47   1.00   1.05   0.12
427 W/S          1.57   1.11   1.53   0.63   0.98   0.00
435 Y/S          1.93   1.16   1.58   1.41   1.24   2.04
450 T/N          0.62   0.48   0.58   0.75   0.75   0.60
457 D/A          0.62   0.39   0.44   0.28   0.62   0.20
457 D/R          0.84   0.55   0.92   0.32   0.58   0.56
470 P/L          0.80   0.64   0.72   0.72   0.18   0.24
475 M/S          0.06   1.02   0.33   1.50   1.39   0.92
477 D/V          0.50   0.09   0.00   0.07   0.52   0.00

The general patterns observed are broadly
similar to those of many CD4 site antibodies and of
soluble CD4. Fab b12 is distinguished by its decreased
binding to a mutant in which the V1 and V2 loops
are deleted. This may or may not be related to the
enhanced neutralizing ability of Fab b12. However,
it is clear that the V1 and V2 loops and the V3
loop can affect antibody binding to the CD4 binding
site either by direct contact or by transmitted
conformational effects.

Sensitivity to certain mutations in residues,
particularly towards the C-terminus of gp120, has
previously been associated with CD4 binding site
antibodies (Thali et al., J. Virol., 66:5636-5641
(1992) and Thali et al., J. Virol., 65:6188-6193
(1991)). These mutations include residue 257
mutated from threonine to arginine (257 T/R), 368
D/R, 370 E/R, 457 D/A and 477 D/V. Most of these
mutations abrogate Fab binding or reduce it to low
levels, consistent with the assignment of the
recombinant Fabs in this assay as reacting with the
CD4 site.

In a particular mutant of gp120, the V1/V2
loop (residues 119-205) is completely removed.
This mutation enhances the binding of Fabs b6, b11,
and b14 but significantly decreases the binding of
Fab b12. Deletion of the V3 loop produces a more
modest decrease in Fab b12 binding while generally
enhancing the binding of the other Fabs. The 314
G/W change in the V3 loop produces a decrease in
binding of all the Fabs. This effect has been
observed for other CD4 binding site antibodies
(Moore and Sodroski, unpublished observations).

When the binding specificities of the Fabs are
examined in detail, each Fab has a unique mutant
binding profile. For example, Fab b14 binding is
eliminated by the 113 D/A change whereas the
binding of the other Fabs is unchanged or enhanced;
Fab b3 and b11 binding is reduced by the 475 M/S
mutation but binding by the other Fabs is unchanged;
and the 370 E/Q change reduces binding of all the
Fabs except for b6 and possibly b11. Fab b12 is
distinguished by its decreased binding to a mutant
in which the V1 and V2 loops are deleted. This may
or may not be related to the enhanced neutralizing
ability of Fab b12 and will be the subject of
further study. However, it is clear that the V1
and V2 loops and the V3 loop can affect antibody
binding to the CD4 binding site either by direct
contact or transmitted conformational effects.

The effects on Fab binding of a series of
point mutations in gp120 afford the opportunity to
look more closely at recognition differences. The
general patterns observed are broadly reminiscent
of many CD4 site antibodies and of soluble CD4
itself. Fab b12 is distinguished by its decreased
binding to a mutant in which the V1 and V2 loops
are deleted. This may or may not be related to the
enhanced neutralizing ability of Fab b12. It will
be necessary to study a number of variants of Fab
b12, which could be produced by chain shuffling or
mutation, to answer this question. However, it is
clear that the V1 and V2 loops and the V3 loop can
affect antibody binding to the CD4 binding site
either by direct contact or transmitted
conformational effects.

6. Determination of the Relationship Between the
Epitopes Recognized by the Fabs with HIV-1
Antigen Multimeric Complexes

a. Comparison of Fab Epitopes with gp120 and
gp160 Expressed as Multimeric Complexes
on the Surface of COS-1 Cells
Given the lack of correlation of Fab
neutralization with binding parameters assessed
using recombinant gp120, the binding of Fabs b3,
b6, and b12 to COS-1 cells expressing the HXBc2
envelope glycoproteins gp160 and gp120 was
compared. Fab b3, the poorest neutralizer, Fab b6,
also a poor neutralizer, and Fab b12, the most
effective neutralizer as determined in Example 5,
were used in the assay. The envelope glycoproteins
expressed by the COS-1 cells were gp160, the
precursor of gp120 and gp41, and the mature gp120.
In this assay, different concentrations of Fab were
incubated with radiolabeled COS-1 cells which
express gp160 and gp120 on their surface. The
cells were then washed and lysed. The gp120 and
gp160 envelope glycoproteins bound to Fab were
precipitated with goat anti-F(ab')2 antibody and
analyzed by protein gel electrophoresis; the results
are shown in Figure 20. Since the amount of HIV-1 envelope
glycoprotein expressed on the surface of
transfected COS-1 cells is small compared with the
amount present intracellularly, after cell lysis,
the bound Fab is presented with a large excess of
both mature gp120 and gp160 precursor forms. The
total amount of envelope glycoproteins precipitated
thus provides an indication of the amount of Fab
bound to the cell surface. Scanning densitometry
profiles were derived from the autoradiographs and
are expressed in arbitrary densitometric units.

Although the lack of saturation for Fabs b6
and b3 precludes a precise estimate of affinity, it
is clear that Fab b3 exhibits a lower affinity for
the precursor gp160 than either Fab b6 or b12.
When the binding of Fab b12 and b6 are compared,
several differences are apparent. Assuming that
Fab b6 achieves saturation at concentrations
slightly higher than 150 µg/ml, the estimated
affinities of Fab b12 and b6 for the total
population of envelope glycoproteins recognized
differ only marginally. The most striking
difference in the binding of Fab b12 and b6 to the
multimeric envelope glycoprotein complex is the
preferential detection of gp120 relative to gp160
by Fab b12. Using densitometry to estimate
amounts, it is seen from Figure 20 that Fab b12
immunoreacts with an amount of gp120 that is at
least about 50% more than the gp160 present in the
immunoreaction admixture. The estimated
affinities, based on the Fab concentrations at
which half-maximal binding to gp120 is observed,
are 3 x 10^7 M^-1 and <6 x 10^6 M^-1 for Fabs b12 and b6,
respectively.
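
As a rough illustration of how an affinity can be read from a half-maximal binding concentration, the sketch below converts between a concentration in µg/ml and an apparent association constant, using the common approximation that the association constant is the reciprocal of the molar concentration giving half-maximal binding. It is not the calculation used to obtain the values above: the function names are hypothetical, and the Fab molecular weight of 50,000 Da is an assumed round figure that does not appear in the source.

```python
ASSUMED_FAB_MW_DA = 50_000.0  # assumption: approximate mass of a Fab fragment


def association_constant(half_max_ug_per_ml: float,
                         mw_da: float = ASSUMED_FAB_MW_DA) -> float:
    """Apparent Ka (M^-1) from the concentration giving half-maximal binding.

    1 ug/ml equals 1e-3 g/L, so molarity = (ug/ml * 1e-3) / MW.
    Ka is approximated as 1 / molarity at half-maximal binding.
    """
    molar = half_max_ug_per_ml * 1e-3 / mw_da
    return 1.0 / molar


def half_max_concentration(ka_per_molar: float,
                           mw_da: float = ASSUMED_FAB_MW_DA) -> float:
    """Inverse conversion: ug/ml at half-maximal binding for a given Ka."""
    return mw_da * 1e3 / ka_per_molar


# Under the assumed molecular weight, the stated Ka for Fab b12 corresponds to
# a half-maximal concentration of a little under 2 ug/ml.
b12_half_max = half_max_concentration(3e7)
print(round(b12_half_max, 2))
```

The conversion is only as good as the molecular weight and the simple one-site binding approximation it assumes.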

The binding of the Fabs to the multimeric
envelope glycoprotein complex on the transfected
COS-1 cell surface provides some insights into the
observed differences in neutralization potency.
The binding of the most potent neutralizing Fab,
Fab b12, achieves saturation at roughly 100 µg/ml,
whereas neither of the less potent neutralizing
Fabs achieves saturation even at 150 µg/ml. Fab b3
clearly exhibits a lower affinity for the cell
surface envelope glycoprotein complex than do the
other two Fabs tested, b12 and b6. The most
striking difference in the binding of b12 and b6 to
the multimeric envelope glycoprotein complex is the
preferential precipitation of gp120 relative to
gp160 by the bound Fab b12. In addition to these
differences in gp120 recognition, it appears that
the overall number of cell surface envelope
glycoproteins capable of being recognized by the
less neutralizing Fabs is greater than that seen
for Fab b12. These differences suggest that Fab
b12 may recognize a more limited subset of envelope
glycoprotein conformations and that these
conformations are better approximated by the mature
gp120 glycoprotein in the cell lysates. It is
known that the gp160 precursor assumes a greater
variety of conformations during the maturation
process than does the fully folded gp120 product
(Thiriart et al., J. Immunol., 143:1832-1836
(1989) and Fennie and Lasky, J. Virol., 63:639-646
(1989)). The enhanced neutralization ability of
Fab b12 could reflect a higher affinity for a
restricted gp120 conformation present in the
functionally relevant subset of envelope
glycoprotein spikes. Such a functionally relevant
group of envelope glycoprotein moieties probably
represents a small subset of the total population,
consistent with the low infectious fraction
associated with HIV-1 and other retroviral virus
preparations. One caveat to these observations is
that the glycosylation of gp120 expressed as a
recombinant protein in baculovirus or on the
surface of COS-1 cells is likely to differ and this
could affect binding of the Fabs of this invention.
However, no difference in the affinity for CD4
binding site antibodies between the two forms of
gp120 has been observed previously using a range of
antibodies (Moore and Sodroski, unpublished
observations). In addition, these studies employed
a molecular clone of HIV-1 and its extension to
primary isolates will need to be studied further.

Fabs derived from combinatorial libraries may
be viewed as "artificial". However, as shown here,
the recognition properties of a set of antibodies
directed to the CD4 site of gp120 show many
features in common with those derived by
conventional means. They also show many features
in common with one another suggesting that, with
the caveats inherent in the library approach
(Barbas et al., J. Molec. Biol., 230:812-823 (1993)
and Burton and Barbas, Nature, 359:782-783 (1992)),
one individual produces several clearly distinct
antibodies directed to a common structural feature,
i.e., the CD4 binding site. This is in agreement
with observations made on anti-CD4 binding site
antibodies using anti-idiotype antibodies (Chamat
et al., J. Immunol., 149:649-654 (1992) and
Hariharan et al., J. Virol., 67:953-960 (1993)).
One advantage of producing several antibodies is
that escape (at least in binding terms) is made
more difficult. The only mutations in Table 9
which essentially eliminate the binding of all the
antibodies also reduce CD4 binding ability.

The observations presented here have
significance for vaccine development. The most
effective vaccine may need to induce antibodies to
the CD4 binding site with properties similar to
those of Fab b12. Given the data above,
recombinant gp120 offers no special qualities in
this regard. Further, the Fab b12 type of antibody
formed only about 10% (4/33 Fabs) of the cloned
response of the library donor (Barbas et al., J.
Molec. Biol., 230:812-823 (1993)) and has not been
described amongst the human antibodies derived by
other means, suggesting it may be a minor component
of typical responses. It is clearly of some
interest for vaccine design to define more
precisely the structure recognized by Fab b12.

7. Recognition of gp120 from Primary HIV-1
Isolates by b12 IgG1 in Vitro
The ability of the b12 IgG1 to recognize the
gp120 molecule from HIV-1 virus from 69 primary
isolates was determined in an ELISA assay.
Recognition of the primary HIV-1 virus isolate with
b12 IgG1 is indicative of the prevalence of the b12
epitope in the HIV-1 pandemic. To probe the
occurrence of the b12 epitope in the HIV-1
pandemic, binding of the b12 IgG1 to gp120 from 69
international isolates belonging to 6 different
clades was examined. Virus isolates assayed were
obtained from the WHO, HMJFAMM, and NIAID.

Infectious culture supernatants containing 
virus and free gpl20 were treated with l%(v/v) 
Nonidet-P40 (NP40) non-ionic detergent to provide a 
source of gpl20 (Moore et al . , AIDS . 3:155-160 
(1989)). Microplate wells (Immulon II, Dynatech, 
Ltd.) were first coated with sheep polyclonal 
antibody D7324. This antibody was raised to the 
peptide APTKAKRRWQREKR , derived from the C- 
terminal 15 amino acids of the clade B 1 1 IB HIV-1 
viral isolate. Next, an appropriate volume of 
inactivated supernatant containing gpl20 was 
diluted with a buffer comprising tris-buf f ered 
saline (TBS)/1% (v/v) NP40/10% fetal calf serum 
(FCS) and a 100 fxl aliquot added to the microplate 
wells for 2 hours at room temperature. Unbound 
gpl20 was removed by washing with TBS, and bound 
gpl20 was detected with CD4-IgG (1 /ig/ml) or with 
bl2 IgGl diluted in a buffer comprising TBS/2% (w/v) 
nonfat dry milk powder/20% (v/v) sheet serum (TMTSS) 
essentially as previously described (Moore et al., 
AIDS , 4:307-310 (1990)) and Moore et al . , i_ 
Virol. , 68:469-473 (1994)). CD4-IgG is a fusion 
molecule which consists of CD4 and IgG. The CD4 
portion binds to gpl20 and the IgG portion provides 
the means for detection of the CD4-IgG fusion 
molecule with labeled anti-IgG reagents. Bound 
antibody was then detected with an appropriate 
alkaline-phosphatase conjugated anti-IgG, followed 
by AMPAK (Dako Diagnostics) . Absorbance was 
determined at 492 nm (OD 492 ) . Each virus was tested 
against CD4-IgG in triplicate and against b!2 IgGl 



in duplicate. All OD492 values were corrected for
non-specific antibody binding in the absence of
added gp120 (buffer blank). The mean, blank-
corrected OD492 values for CD4-IgG and b12 IgG1 were
then calculated, and the OD492 ratios of b12
IgG1:CD4-IgG were determined. This normalization
procedure allows for the
different amounts of gp120 captured onto the solid
phase via antibody D7324 when comparing antibody
reactivity with a panel of viruses. Binding ratios
of 0.50 or greater were deemed to represent strong
antibody reactivity; ratios from 0.25 to 0.50 were
considered indicative of moderate reactivity;
values of <0.25 were designated as representative
of essentially negative monoclonal antibody
reactivity.
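
The normalization and scoring just described can be sketched in Python
as follows. This is an illustrative sketch only and not part of the
protocol itself: the function names, the use of a single buffer-blank
value per detection reagent, and the example readings are assumptions
introduced here.

```python
from statistics import mean


def binding_ratio(b12_od, cd4_od, b12_blank, cd4_blank):
    """Return the blank-corrected mean OD492 ratio of b12 IgG1 to CD4-IgG.

    b12_od / cd4_od are replicate OD492 readings (duplicate and
    triplicate in the text); the blanks are readings without added gp120.
    """
    b12 = mean(b12_od) - b12_blank
    cd4 = mean(cd4_od) - cd4_blank
    if cd4 <= 0:
        # No measurable gp120 capture, so the ratio is undefined.
        raise ValueError("CD4-IgG signal is not above the buffer blank")
    return b12 / cd4


def classify(ratio):
    """Apply the thresholds stated in the text (0.50 and 0.25)."""
    if ratio >= 0.50:
        return "strong"
    if ratio >= 0.25:
        return "moderate"
    return "negative"


# Hypothetical readings for one virus isolate (not data from the text).
ratio = binding_ratio([0.9, 1.1], [1.9, 2.0, 2.1], 0.1, 0.1)
print(round(ratio, 2), classify(ratio))
```

Dividing by the CD4-IgG signal is what compensates for differences in
the amount of gp120 captured from each isolate.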

As shown in Figure 22, b12 IgG1 reacts with
*50% of clades A-D but only 1 of 12 isolates from 
clade E. Reactivity with clade B isolates from the 
U.S.A. is approximately 75%. 

8. Nucleic Acid Sequence Analysis Comparison
Between HIV-1 Specific Monoclonal Antibody
Fabs and the Corresponding Derived Amino Acid
Residue Sequence

To explore the relationship between
neutralizing and weakly or non-neutralizing Fabs,
the variable domains of 32 clones expressing human
anti-gp120 Fabs, prepared in Example 2 and including
the 20 listed in Figure 6 for which neutralizing
activity was assessed, were sequenced. In
addition, the five gp41-specific Fabs were also
sequenced.

Nucleic acid sequencing was performed on
double-stranded DNA using Sequenase 1.0 (USB,
Cleveland, OH) and the appropriate primers
hybridizing to sequences in the Cγ1 domain (SEQGb:
5' GTCGTTGACCAGGCAGCCCAG 3', SEQ ID NO 49) or the Cκ
domain (SEQKb: 5' ATAGAAGTTGTTCAGCAGGCA 3', SEQ ID
NO 50). Alternatively, sequencing employed
single-stranded DNA and the T3 primer (5'
ATTAACCCTCACTAAAG 3', SEQ ID NO 51) or one
hybridizing to a sequence in the Cκ domain (KEF:
5' GAATTCTAAACTAGCTAGTTCG 3', SEQ ID NO 52).

The amino acid residue sequences of the
variable heavy and light chains derived from the
nucleic acid sequences of the 32 gp120-specific
clones are shown respectively in Figures 10 and 11.
Groupings are made on the basis of similarities in
heavy chain sequences. Dots indicate identity with
the first sequence in each section. The SEQ ID NOs
are listed to the right of the corresponding
derived heavy and light chain (VH from SEQ ID NO
53-81 and VL from SEQ ID NO 82-113) amino acid residue
sequences in the Figures themselves.

Alignment of derived sequences with one 
another and with the Genbank database made use of 
the MacVector suite of programs. For analysis of 
heavy chain CDR3 sequences as described by Sanz, J. 
Immunol. , 147:1720-1729 (1991), the most 5' 
nucleotide was considered to be the first 
nucleotide after codon 95 of the H chain variable 
region according to Kabat et al., Sequences of
Proteins of Immunological Interest, US Dept. of
Health and Human Services, Washington, DC (1991).
The most 3' nucleotide was assigned to the last
unidentified nucleotide before the sequence matched 
with the published germline JH genes. The CDR3 
sequences were analyzed using the DNASTAR software. 
Sequence comparisons were performed with both the 
ALIGN and COMPARE programs in order to determine 
the germline D gene which provided the best 
homology throughout. In a second step, the SEQCOMP 
program was used to find sequence identity of at 
least six nucleotides with either the coding strand 
or the reverse complement of germline D genes. 
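
The second step, a search for at least six identical nucleotides on
either strand of a germline D gene, can be illustrated with a short
Python sketch. The algorithm used by the SEQCOMP program is not
described here, so the sketch substitutes a plain longest-common-run
search; the function names and the example sequences are hypothetical
and are not germline D genes or CDR3 sequences from this disclosure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def longest_shared_run(query, target):
    """Return (length, start_in_query) of the longest identical run."""
    best_len, best_start = 0, 0
    prev = [0] * (len(target) + 1)
    for i, q in enumerate(query, 1):
        cur = [0] * (len(target) + 1)
        for j, t in enumerate(target, 1):
            if q == t:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_start = cur[j], i - cur[j]
        prev = cur
    return best_len, best_start


def d_gene_matches(cdr3, d_genes, min_identity=6):
    """List D genes sharing >= min_identity contiguous nt with the CDR3."""
    cdr3 = cdr3.upper()
    hits = []
    for name, d in d_genes.items():
        strands = (("coding", d.upper()),
                   ("reverse complement", reverse_complement(d)))
        for strand, seq in strands:
            n, start = longest_shared_run(cdr3, seq)
            if n >= min_identity:
                hits.append((name, strand, n, cdr3[start:start + n]))
    return sorted(hits, key=lambda h: -h[2])


# Hypothetical inputs for illustration only.
example_d_genes = {"D_example_1": "GGTATAACTGGAAC",
                   "D_example_2": "TTTTTTCCCCCC"}
hits = d_gene_matches("GCGAGAGTTCCAGTTATGACTAC", example_d_genes)
for hit in hits:
    print(hit)
```

Searching the reverse complement as well as the coding strand reflects
the statement above that identity with either orientation of a
germline D gene was accepted.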



The heavy and light chain sequences of the
gp41-specific Fabs are shown in Figures 18 and 19,
respectively. The amino acid residue sequence of
the CDR3 heavy chain exhibits more variation
between the Fabs than any other region of the
variable domain.

a. Organization of Antibodies into Groups 
According to Heavy Chain Sequence 
VH and VL domains of 32 gp120 clones were
sequenced and the VH domains compared using
MacVector software. This analysis immediately
established that a number of the clones, including
those selected by panning against different
antigens, are closely related to one another. The
exceptions are the Fabs selected by panning
against the V3 loop peptide, which are not related
to the Fabs selected by panning against the
gp120/160 antigens. Figures 10A and 10B show that
the VH sequences derived from gp120/160 panning can
be organized into 7 groups. The broad features
apparent from a comparison of amino acid sequences
are discussed herein.

The relatedness of sequences within a group
varies considerably. For instance, in the group
beginning with clone number b8 the amino acid
sequences are very similar. Six clones were
identical and the remainder showed a maximum of 5
differences from the predominant sequence (the EQ
difference due to the 5' primer excluded). Only
one clone showed a single difference in the CDR3
region. The average discrepancy over all the
sequences in this group from the predominant
sequence is 1.1 amino acid residues per variable
domain. This amount corresponds to the order of
magnitude of discrepancies which could arise from
the PCR. Sequencing of constant domains indicated
a PCR error frequency of about 1 base change per
domain.

In contrast, in the group headed by clone b3, 
no two clones were absolutely identical. The 
average difference from the consensus group
sequence is 3.3 residues per sequence, and the
corresponding value for the CDR3 alone is 1.3.
Therefore, it seems likely that the heavy chains in 
this group are somatic variants of one another. 
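
The per-group averages quoted above are mean counts of amino acid
differences from a reference sequence. A minimal Python sketch of that
calculation follows, assuming aligned sequences of equal length and
taking the most frequent sequence in the group as the reference. The
function names and the toy sequences are hypothetical; for a group in
which no two clones are identical, a position-by-position consensus
would be needed instead, which this sketch does not build.

```python
from collections import Counter


def predominant(seqs):
    """Return the most frequent sequence in the group."""
    return Counter(seqs).most_common(1)[0][0]


def differences(a, b):
    """Count mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def mean_discrepancy(seqs):
    """Mean number of differences from the predominant sequence."""
    ref = predominant(seqs)
    return sum(differences(s, ref) for s in seqs) / len(seqs)


# Toy group: three identical clones and two variants (not real data).
toy_group = ["QVQLVQSG"] * 3 + ["QVQLVESG", "QVKLVESG"]
print(mean_discrepancy(toy_group))
```

A group mean close to the expected PCR error rate is consistent with a
single amplified sequence, whereas a clearly higher mean is more
readily explained by somatic variants.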

The group headed by clone b1 presents a third
pattern. Clones b1 and b14 are identical, as are
clones b2 and B2. However, 23 amino acid
differences exist between the two sets of clones.
Clones b24 and B30 are approximately equally well
differentiated (13-25 differences) from either of
these two sets of clones or one another. Still the
CDR3 regions are very similar. A number of
explanations can be suggested for this pattern: 1)
all clones in this group originate from the same
germline gene which has undergone extensive somatic
mutation, 2) cross-over events have occurred to
essentially recombine different germline genes with
the same DJ combination, 3) a "convergent
evolution" process has led to the selection of
different germline genes associated with the same
DJ combination.

b. Sequences of the VL Domains from the
gp120 Binders

The VL sequences of the Fabs, organized into
the groups defined in Figures 10A and 10B, are
shown in Figures 11A and 11B.
Immediately apparent was the extensive chain
promiscuity as evidenced by the pairing of
different light chains with the same or a very
similar heavy chain with retention of antigen
binding capability and indeed, for the most part,
antigen affinity as compared with Figures 10A and
10B. This promiscuity can be explored further by
reference to the groups considered above.

The clone b8 group, in which the heavy chain
members were identical or very similar, also
produced 4 light chains which are identical or very
similar (less than 3 amino acid differences).
Therefore a predominant heavy-light chain
combination can be described for this group. One
member (clone b8) had the same or very closely
related VL gene but appeared to use a different Jκ
gene. Two other members (clones B8 and b18) were
more distantly related to the major sequence (7-12
differences). Two further clones (b13 and B26)
used a Vκ gene from a different family, Vκ3
compared to Vκ1, and therefore were unrelated to
the major sequence.

The clone b3 group, suggested to contain 
somatic variants of a single heavy chain, showed 
considerable light chain diversity with no two 
members being closely related to one another. 
Vκ3-Jκ2 combinations predominated but Vκ3-Jκ3 and
Vκ1-Jκ3 combinations also occurred.

On the other hand, in the clone b1 group
evidence existed for the heavy chains being more
choosy about their light chain partner. Thus,
closely related heavy chains appeared to be paired
with related light chains. The identical heavy
chain pairs (b1 and b14; b2 and B2) had very
similar light chains (2 and 4 amino acid
differences respectively) whereas the distinct
heavy chains (b24 and B30) had distinct light
chains which were unrelated to one another or the
other group members. The clone b4 group provides
another example of this phenomenon in that 4
closely related heavy chains were paired with 3
closely related light chains (a predominant
heavy-light chain combination), except for the
clone b7 light chain that was distinct.

In summary, the heavy chain (VH) sequences were
organized into 7 groups where each member of a
group has an identical or very similar CDR3 region
with a limited number of differences elsewhere.
When the light chains (VL) were constrained into the
groupings defined by their heavy chain partners,
considerable light chain sequence variation was
observed. This phenomenon of chain promiscuity has
been observed previously and can be appreciated by
reference to Figures 11A and 11B. Marked
neutralizing ability was confined to two groups of
sequences. The first group consisted of Fabs 4, 7,
12 and 21, which have very similar heavy and light
chains. The second group consisted of Fabs 13, 8,
18, 22 and 27. Only Fab 13 showed marked
neutralizing ability, although the others showed
some weaker activity. Interestingly, in this group
Fab 13 did have a light chain distinct from the
other members of the group.

9. Shuffling of the Heavy and Light Chain of a
Single Clone Against the Library
To further explore possible functional
heavy-light chain combinations, the heavy chain of
clone b12 (also referred to as Fab 12 for the
corresponding soluble Fab preparation) shown in
Figures 10A and 10B was recombined with the
original light chain library prepared in Example 2
to construct a new library H12-LCn. In addition,
the b12 light chain was recombined with the
original heavy chain library to construct a library
Hn-L12. These two libraries were taken through 3
rounds of panning against gp120 (IIIB) as described
in Example 2b5). The Fabs expressed from the
resultant immunoreactant clones were analyzed as
described in Example 3 above. Clone b12 was chosen
as this Fab neutralized HIV-1 in vitro as shown in
Example 3.



To accomplish the preparation of a shuffled
library from the Fd gene of clone b12 with the
original light chain library, the b12 heavy chain
was first subcloned into a tetanus toxoid binding
clone expressed in pComb2-3. The light chain
library was then cloned into this construction to
give a library of 1 x 10^7 members. The subcloning
step was used to avoid contamination with and
over-representation of the original light chain. A
similar procedure was adopted for shuffling of
heavy chains against the light chain from clone b12
to give a library of 3 x 10^6 members. Cloning and
panning procedures were carried out as described
above for the original library.

Eleven light chains which recombined with the
b12 heavy chain and bound gp120 by panning were
randomly chosen for subsequent competition ELISA
and sequence analysis. The apparent affinities of
these shuffled combinations were similar, with an
IC50 of approximately 10^-8 to 10^-9 M. The sequences
fell into a set of 3 that were very similar
to the original b12 light chain and a set of 8
that showed many differences from the original, with
some sub-grouping possible.

The sequences of the light chains which bound
to the b12 heavy chain clone are shown in Figures
12A and 12B. The sequences are compared to the
sequence for the original light chain from clone
b12. The light chains are identified by numbers
which do not correspond to the original light chain
clones; the assigned numbers of the newly selected
clones having new light chains are thus arbitrary.
The sequences of these light chains are also listed
in the Sequence Listing from SEQ ID NO 114 to 122.
Some light chain sequences are identical. In
addition to immunoreactivity with gp120, the new
Fabs isolated from these shuffled clones were
tested in the syncytia assay for neutralization of
HIV-1 infection as described in Example 3. Four
shuffled monoclonal Fab antibodies, each having the
heavy chain from clone b12, a known HIV-1
neutralizing clone, and new light chains designated
L28, L25, L26 and L22, all exhibited approximately
60% neutralization in a syncytia assay with 0.4
µg/ml purified Fab. This effect was equivalent to
that obtained with the original clone b12 heavy and
light chain pair. Maximum neutralization of
approximately 80% was obtained with the H12/L28 and
H12/L25 Fabs at 0.7 µg/ml, which was equivalent to
that seen with the original clone b12 heavy and
light pair. The neutralization resulting from the
H12/L22 and H12/L26 Fabs plateaued at 60% with Fab
concentrations of 0.4 µg/ml up to 1.0 µg/ml. Thus,
in addition to the gp120 immunoreactive and HIV
neutralizing Fabs obtained in the original library
prepared as described in Example 2, by shuffling a
known neutralizing heavy chain with a library of
light chains, new HIV-1 neutralizing Fab monoclonal
antibodies have been obtained.

Ten heavy chains which recombined with the b12
light chain were also randomly chosen. One was
very similar to the original b12 heavy chain but
the others had many differences. Nevertheless,
the V-D and D-J junctions were essentially 
identical indicating the clones had probably arisen 
from the same rearranged B-cell clone by somatic 
modification. Competition ELISA failed to reveal 
any clear difference in affinity between the 
variants selected from those originally analyzed. 

The sequences of the heavy chains which bound
to the b12 light chain clone are shown in Figures
13A and 13B. The sequences are compared to the
sequence for the original heavy chain from clone
b12. The heavy chains are identified by numbers
which do not correspond to the original heavy chain
clones; the assigned numbers of the newly selected
clones having new heavy chains are thus arbitrary.
The sequences of these heavy chains are also listed
in the Sequence Listing from SEQ ID NO 123 to 132.
Some heavy chain sequences are identical. In
addition to immunoreactivity with gp120, the new
clones were tested in the syncytia assay for
neutralization of HIV-1 infection as described in
Example 3. Two shuffled monoclonal Fab antibodies,
each having the light chain from clone b12, a known
HIV-1 neutralizing clone, and new heavy chains
designated H2 and H14, exhibited approximately 40%
neutralization in a syncytia assay with 1.0 and 0.5
µg/ml purified Fab, respectively. This effect was
equivalent to that obtained with the original clone
b12 heavy and light chain pair at a concentration
of 2 µg/ml. Maximum neutralization of
approximately 50% was obtained with the Fab having
the new H14 chain at 1.0 µg/ml compared to 80%
neutralization with 0.7 µg/ml with the original
clone b12 heavy and light pair. Thus, in addition
to the gp120 immunoreactive and HIV neutralizing
Fabs obtained in the original library prepared as
described in Example 2, by shuffling a known
neutralizing light chain with a library of heavy
chains, new HIV-1 neutralizing Fab monoclonal
antibodies have been obtained.

Thus, this shuffling process revealed many
more heavy and light chain partners that bound to
gp120 and that were equal in affinity to those obtained
from the original library prepared in Example 2.
With this approach, additional HIV-1 neutralizing 
antibodies can easily be obtained over those 
present in an original library. The complexity of 
the clones arising from the heavy chain shuffling 
also suggests that this approach may be used to map 
the course of somatic diversification. 

Combinatorial libraries randomly recombine
heavy and light chains, so the extent to which
antibodies derived from such libraries represent those
produced in a response in vivo can be determined.
In principle, a heavy-light chain combination
binding antigen could arise fortuitously, i.e.,
neither chain is involved in binding antigen in
vivo but the combination does bind antigen in
vitro.

The available data suggest, however, that
heavy chains, from immune libraries, involved in
binding antigen tightly in vitro arise from
antigen-specific clones in vivo. First, studies
have generally failed to identify high-affinity
binders in non-immunized IgG libraries. See
Persson et al., Proc. Natl. Acad. Sci. USA,
88:2432-2436 (1991) and Marks et al., Eur. J.
Immunol., 21:985-991 (1991).

Further, as described above, gp120 binders
were not observed in panning a bone marrow IgG
library from an HIV seronegative donor against
gp120. Second, heavy chains associated with
binders from immunized libraries were typically at
relatively high frequency in the library, indicating
they were strongly represented in the mRNA isolated
from immunized animals. See Caton et al., Proc.
Natl. Acad. Sci. USA, 87:6450-6454 (1990) and
Persson et al., supra. Third, heavy chains from
immunized libraries appeared to dictate specificity
when recombined with various unrelated light chains
as described in Example 10. Fourth, the isolation
of intraclonal heavy chain variants as described here
indicated that an active antibody response was
cloned. Thus, the shuffling of a known heavy chain
with a light chain binder and vice versa is
preferred for use in this invention as new
neutralizing Fabs can be obtained beyond those
generated in vivo.

Heavy chain promiscuity, i.e., the ability of
a heavy chain to pair with different light chains
with retention of antigen affinity, presents
serious problems for identifying in vivo light
chain partners. This applies not only to the
strict definition of partners as having arisen from
the same B-cell but also to one which would
encompass somatic variants of either partner. The
existence of predominant heavy-light chain
combinations, particularly involving intraclonal
light chain variants, suggests that the light
chains concerned are well represented in the
library and probably are associated with antigen
binding in vivo. However, promiscuity means that,
although some combinations probably do occur in
vivo, one cannot be certain that one is not
shuffling immune partner chains in the
recombination. For instance, the occurrence of a
virtually identical light chain (b6, B20) in 2 out
of 33 clones suggests that it is probably
over-represented in the library, consistent with an
in vivo involvement in antigen-stimulated clones.
However, there is no way of knowing whether the in
vivo partner of the light chain is the b6 or B20
heavy chain or indeed another heavy chain arising
from a stimulated clone.

The light chains arising from the
combinatorial library may not be those employed in
vivo. Nevertheless it is interesting to note that
some heavy chains appear relatively choosy about 
light chain partner whereas others appear almost 
indifferent. This observation needs to be tempered 
by the finding that apparently choosy heavy chains 
from this analysis will accept diverse light chains 
with maintenance of antigen binding in a binary 
plasmid system where pairings are forced as shown 
below in Example 11 rather than selected in a 
competitive situation. 

Two reports compare heavy-light chain
combinations arising from combinatorial libraries
and hybridomas in immunized mice. The library
approach begins with mRNA and is therefore probably
reflecting plasma cell populations. In contrast,
hybridomas are thought to reflect activated but not
terminally differentiated B cell populations and
EBV transformation to reflect resting B cell
populations.

Whatever the arguments about light chain 
authenticity, the heavy chains of Figures 10A and 
10B present many features of interest. The most 
frequently used heavy chain is of the clone b8 
type. It could be argued that this usage simply 
represents bias in PCR amplification. However, the 
occurrence of approximately equal numbers of clones
in this group amplified by VH1a and VH3a primers
argues against this notion. Furthermore, the
existence of intraclonal variants in some groups 
indicates that one is at least sampling different 
genes from the initial library. 

The antibodies cloned here do bear a qualitative
relationship with the polyclonal antibodies present
in the serum of the asymptomatic donor. The titer
of anti-gp120 (IIIB) antibodies was approximately
1:3000, with greater than 50% of the reactivity
being inhibited by CD4 or a cocktail of Fabs from
clones 12, 13 and 14. The titer of anti-gp120
(SF2) antibodies was approximately 1:800. Further,
the titer of serum against the short constrained V3
loop peptide was 1:500 and against the full length
MN V3 loop peptide was only 1:300. The importance
of "anti-CD4 site antibodies" seems general in
donors with longer term HIV infection in that the
cocktail of Fabs 12, 13 and 14 was able to inhibit
binding of a large fraction of serum antibody
reactivity with gp120 (IIIB) in 26 of 28 donors
tested.

The ability of Fabs to neutralize viruses has
been a controversial area. One of the problems has
been that Fabs are classically generated by papain
digestion of IgG. If the Fab, as is often the
case, shows reduced activity relative to the parent
IgG then it may be difficult to rule out IgG
contamination in the Fab preparation. Recombinant
Fabs, however, as shown herein, definitively
neutralize virus.

The mechanism of neutralization of HIV-1
appears to require neither virion aggregation nor
gp120 cross-linking. In addition, there is no
correlation between blocking of the CD4-gp120
interaction and neutralization. The existence of
the cloned neutralizing Fabs of this invention
should allow the molecular features that confer
neutralizing potential to be explored. For
instance, in the case of the group of clones
containing Fab 13, the unique character of the
light chain of that neutralizing clone suggests
that chain shuffling experiments in which the 13
light chain was recombined with the other heavy
chains in that group might be revealing. Heavy
chains paired with two dissimilar light chains have
been shown to retain antigen affinity but exhibit
altered fine specificity as shown in Example 11.

The observation here of a large number of Fabs
with only a limited number being strongly
neutralizing may have important consequences. If
the pattern is repeated for whole antibodies then
it would seem that much of the gp120 structure may
be in a sense a "decoy", i.e., the immune system
may invest considerable effort in producing
antibodies of high affinity but limited anti-viral
function. To exacerbate the situation, the
ineffective antibodies may bind to gp120 and
inhibit the binding of strongly neutralizing
antibodies. This has obvious consequences for
vaccination which should be primarily designed to
elicit neutralizing antibodies of this invention.

10. Shuffling of Selected Heavy and Light Chain
DNA Sequences of a Combinatorial Library in a
Binary Plasmid System

A binary system of replicon-compatible 
plasmids has been developed to test the potential 
for promiscuous recombination of heavy and light 
chains within sets of human Fab fragments isolated 
from combinatorial antibody libraries. The 
efficiency of the system is demonstrated for the 
combinatorial library of this invention derived 
from the bone marrow library of an asymptomatic HIV
donor.

a. Construction of the Binary Plasmid System
The binary plasmids pTAC01H and pTC01 for
use in this invention contain the pelB leader
region and multiple cloning sites from Lambda Hc2
and Lambda Lc3, respectively, and the set of
replicon-compatible expression vectors pFL281 and
pFL261. Both pFL281 and pFL261 have been described
by Larimer et al., Prot. Eng., 3:227-231 (1990),
the disclosure of which is hereby incorporated by
reference. The nucleotide sequences of pFL261 and
pFL281 are in the EMBL, GenBank and DDBJ Nucleotide
Sequence Databases under the accession numbers
M29363 and M68946. The plasmid pFL281 is based on
the plasmid pFL260 also described by Larimer et
al., supra, and having the accession number M29362.
The only distinction between the plasmids pFL260
and pFL281 is that pFL281 lacks a 60 bp sequence of
pFL260 between the Eag I site and the Xma III site
resulting in the loss of one of the two BamH I
sites. This deletion is necessary to allow for
cloning of the BamH I Hc2 fragment into the
expression vector as described herein.

The replicon-compatible expression vectors
share three common elements: (i) the f1
single-stranded DNA phage intergenic (IG) regions;
(ii) the tightly regulated tac promoter and lac
operator; and (iii) an rbs-ATG region with specific
cloning sites. The plasmid vectors differ in their
antibiotic resistance markers and plasmid
replicons: pFL261 carries a gene encoding
chloramphenicol acetyltransferase (cat), conferring
chloramphenicol resistance, and the p15A replicon;
pFL281 carries a gene encoding beta-lactamase
(bla), conferring ampicillin resistance, and the
ColE1 replicon (ori) from pMB1. The p15A and ColE1
replicons permit the coincident maintenance of both
plasmids in the same E. coli host.
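
The features that distinguish the two vectors can be summarized as a
small data structure. The Python sketch below is an illustration only;
the class and function names are assumptions, and the compatibility
test is a deliberate simplification in which two plasmids are treated
as co-maintainable when they carry different replicons and different
resistance markers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    name: str
    replicon: str     # origin of replication
    marker: str       # resistance gene
    antibiotic: str   # antibiotic used to select for the plasmid


# Values taken from the description of pFL261 and pFL281 above.
PFL261 = Vector("pFL261", "p15A", "cat", "chloramphenicol")
PFL281 = Vector("pFL281", "ColE1", "bla", "ampicillin")


def can_coexist(a, b):
    """Simplified rule: different replicons and different markers."""
    return a.replicon != b.replicon and a.marker != b.marker


def selection(*vectors):
    """Antibiotics needed to select for all of the given vectors."""
    return sorted({v.antibiotic for v in vectors})


print(can_coexist(PFL261, PFL281), selection(PFL261, PFL281))
```

Keeping a separate resistance marker on each plasmid is what allows
double transformants to be selected on media containing both
antibiotics.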

The Hc2 and Lc2 vectors prepared in Examples
1a2) and 1a3), respectively, were converted into
the plasmid form using standard methods familiar to
one of ordinary skill in the art and as described
by Sambrook et al., Molecular Cloning: A
Laboratory Manual, 2nd ed., Cold Spring Harbor
Laboratory Press, New York (1989), and subsequently
digested with Xho I-Spe I (pHc2) and Sac I-Xba I
(pLc2). The synthetic linkers for insertion
into the digested pHc2 and pLc2 plasmids were
prepared by American Synthesis. The linkers were
inserted to increase the distance between cloning
sites so as to increase the effectiveness of the
digestions. The 5' and 3' linkers for preparing
the double-stranded linker insert into pHc2 were 5'
TCGAGGGTCGGTCGGTCTCTAGACGGTCGGTCGGTCA 3' (SEQ ID NO
133) and 5' CTAGTGACCGACCGACCGTCTAGAGACCGACCGACCC
3' (SEQ ID NO 134), respectively. The 5' and 3'
linkers for preparing the double-stranded linker
insert into pLc2 were 5'
CGGTCGGTCGGTCCTCGAGGGTCGGTCGGTCT 3' (SEQ ID NO 135)
and 5' CTAGAGACCGACCGACCCTCGAGGACCGACCGACCGAGCT 3'
(SEQ ID NO 136), respectively. The pairs of linker
oligonucleotides were separately ligated to their
respective digested, calf intestinal phosphatase-
treated vectors.
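
Whether a pair of linker oligonucleotides anneals to give the expected
cohesive ends can be checked with a few lines of Python. The sketch
below uses the pHc2 linker pair (SEQ ID NOs 133 and 134) as written
above and assumes that each oligonucleotide contributes a 4-nucleotide
5' overhang, as expected for ends compatible with Xho I (TCGA) and
Spe I (CTAG) digestion. The function names are hypothetical and the
check is offered as an illustration, not as part of the described
method.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def anneal(top, bottom, overhang=4):
    """Describe the duplex formed by two oligos with 5' overhangs.

    Assumes both oligos are written 5' to 3' and that the first
    `overhang` nucleotides of each remain single-stranded.
    """
    core_top = top[overhang:]
    core_bottom = reverse_complement(bottom[overhang:])
    return {
        "paired": core_top == core_bottom,
        "top_overhang": top[:overhang],
        "bottom_overhang": bottom[:overhang],
        "duplex_length": len(core_top),
    }


PHC2_TOP = "TCGAGGGTCGGTCGGTCTCTAGACGGTCGGTCGGTCA"      # SEQ ID NO 133
PHC2_BOTTOM = "CTAGTGACCGACCGACCGTCTAGAGACCGACCGACCC"   # SEQ ID NO 134

result = anneal(PHC2_TOP, PHC2_BOTTOM)
print(result)
```

The pLc2 pair cannot be checked with the same function unchanged,
because SEQ ID NO 136 carries both single-stranded ends, including the
3' extension left by Sac I.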

Subsequently, the multiple cloning sites of
pHc2 and pLc2 were transferred into the expression
vectors, pFL281 and pFL261, respectively. To
accomplish this process, the multiple cloning
regions of both Lc2 and Hc2 were separately
amplified by PCR as described by Gram et al., Proc.
Natl. Acad. Sci. USA, 89:3576-3580 (1992) and as
described in Example 2b using Vent Polymerase (New
England Biolabs) according to the manufacturer's
recommendations. The forward primer, 5'
CAAGGAGACAGGATCCATGAAATAC 3' (SEQ ID NO 137), was
designed to provide a flush fusion of the pelB
leader sequence to the ribosome binding sites of
the cloning vectors pFL261 and pFL281 via its
internal BamH I site indicated by the underlined
nucleotides. The reverse primer, 5'
AGGGCGAATT GGATCC CGGGCCCCC 3' (SEQ ID NO 138), was
designed to anneal downstream of the region of
interest in the parent vector of pHc2/pLc2 and
create a second BamH I site. The resultant Hc2 and
Lc2 PCR amplification products were then digested
with BamH I to provide for BamH I overhangs for
subsequent ligation into BamH I linearized pFL281
and pFL261 vectors, respectively. The resulting
light chain vector containing the Lc2 insert,
designated pTC01, was used in this form, whereas
the heavy chain vector was further modified with a
histidine tail to allow purification of Fab
fragments by immobilized metal affinity
chromatography as described by Skerra et al.,
Bio/Technology, 9:273-278 (1991). For this
purpose, the synthetic linker oligonucleotides,
respectively the 5' and 3' linkers, 5'
CTAGTCATCATCATCATCATTAAGCTAGC 3' (SEQ ID NO 139)
and 5' CTAGGCTAGCTTAATGATGATGATGATGA 3' (SEQ ID NO
140), were inserted into the Spe I site, in effect
removing the decapeptide tag sequence to generate
the heavy chain vector designated as pTAC01H. The
expression of Fab fragment in all subsequent
cloning experiments was suppressed by adding 1%
(w/v) glucose to all media and plates.
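
The coding content of the histidine-tail linker can be read off with a
short Python sketch. The reading frame is an assumption made here for
illustration: translation is taken to start immediately after the
first five nucleotides (CTAGT) of the 5' linker, which is the frame in
which the repeated CAT codons are read as histidine. The codon table is
deliberately restricted to the codons needed, and the function name is
hypothetical.

```python
# Minimal codon table: only histidine and stop codons are needed here.
CODONS = {"CAT": "H", "CAC": "H", "TAA": "*", "TAG": "*", "TGA": "*"}


def translate_until_stop(dna, offset):
    """Translate codons from `offset` until a stop codon is reached.

    Codons missing from the minimal table are reported as '?'.
    """
    peptide = []
    for i in range(offset, len(dna) - 2, 3):
        residue = CODONS.get(dna[i:i + 3], "?")
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


TOP_LINKER = "CTAGTCATCATCATCATCATTAAGCTAGC"  # SEQ ID NO 139

# Assumed frame: codons begin after the leading CTAGT.
his_tail = translate_until_stop(TOP_LINKER, offset=5)
print(his_tail)
```

Under that assumed frame the linker encodes a run of histidine residues
followed directly by a stop codon, so translation of the heavy chain
ends at the tail.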

b. Construction of Expression Plasmids
For expression of the light chain
variable domain, pTC01 prepared above was first
digested with Sac I and Xba I; individual light
chain inserts were then obtained by separately
digesting 22 of the pComb2-3 plasmids prepared and
screened as described in Example 2 and listed in
Figure 7 that bind to gp120 with the same
combination of enzymes and isolating the 0.7 kb
fragment using low melting point agarose gel
electrophoresis followed by β-agarase digestion.
For the chain-shuffling experiments, the following
representative members of each of the seven groups
shown in Figure 7 were chosen: b11; b6;
b4-b12-b7-b21; b3; SB; b1-b14-b24;
b13-b22-B26-b8-b18-b27-B8-B35-s4; and one loop
peptide-binding clone, p35. The different groups
are indicated by semicolon separations while
members of the same group are joined by dashes. The
resultant isolated light chains were separately
ligated into pTC01 overnight at 16°C under standard
conditions using a 5:1 molar insert-to-vector ratio
to form 21 light chain pTC01 expression vectors.
For expression of the heavy chain variable domain,
pTAC01H prepared above was first digested with Xho
I and Spe I; heavy chain inserts were then obtained
by separate PCR amplification reactions of the 20
pComb2-3 plasmids from which light chain inserts
were obtained. PCR was used to isolate the heavy
chain inserts instead of restriction digestion in
order to obtain heavy chain without the cpIII gene
anchor sequence in the vector. For the PCR
reaction, the respective 5' and 3' primers, 5'
CAGGTGCAGCTCGAGCAGTCTGGG 3' (VH1a) (SEQ ID NO 42)
and 5' GCATGTACTAGTTTTGTCACAAGATTTGGG 3' (CG1z)
(SEQ ID NO 44), were used to amplify the region
corresponding to the heavy chain as described in
Examples 2a1) and 2a2). The resultant PCR products
were purified by low-melting point agarose gel
electrophoresis, digested with Xho I and Spe I,
re-purified, and
separately ligated to the similarly prepared heavy
chain pTAC01H vector using a 1:2 molar
vector-to-insert ratio to form 21 heavy chain pTAC01H
expression vectors.

c. Co-transformation of Binary Plasmids
CaCl2-competent XL1-Blue cells
(Stratagene; recA1, endA1, gyrA96, thi, hsdR17,
supE44, relA1, lac, {F' proAB, lacIq, ZΔM15,
Tn10(tetR)}) were prepared and transformed with
approximately 0.5 µg purified DNA of each plasmid
in directed crosses of each of the 20 light chain
vectors with each of the 20 heavy chain vectors.
The presence of both plasmids and the episome was
selected for by plating transformants on
triple-antibiotic agar plates (100 µg/ml
carbenicillin, 30 µg/ml chloramphenicol, 10 µg/ml
tetracycline, 32 g/l LB agar) containing 1%
glucose.

A binary plasmid system consisting of two
replicon-compatible plasmids was constructed as
shown in Figure 14. The pTAC01H heavy chain vector
schematic is shown in Figure 14A and the pTC01
light chain vector schematic is shown in Figure
14B. Both expression vectors feature similar
cloning sites including pelB leader sequences
fused to the ribosome binding sites and the tac
promoters via BamH I sites as shown in Figures 15A
and 15B. The nucleotide sequences of the multiple
cloning sites along with the tac promoter, ribosome
binding sites (rbs) and the underlined relevant
restriction sites for the light chain vector,
pTC01, and heavy chain vector, pTAC01H, are
respectively shown in Figure 15A and Figure 15B.
The sequences are also listed in the Sequence
Listing as described in the Brief Description of
the Drawings. The heavy chain vector pTAC01H also
contains a (His)5 tail to allow purification of the
recombinant Fab fragments by immobilized metal
affinity chromatography. The presence of both
plasmids in the same bacterial cell is selected for
by the presence of both antibiotics in the media.
Expression is partially suppressed during growth by
addition of glucose and induced by the addition of
IPTG at room temperature. Under these conditions,
both plasmids are stable within the cell and
support expression of the Fab fragment as assayed
by ELISA using goat anti-human kappa and goat
anti-human IgG1 antibodies.
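
As an illustration of the leader and histidine tail features, the sketch below translates two stretches of the heavy chain vector cloning region as printed in SEQ ID NO 11 of the Sequence Listing. It assumes the standard genetic code; the helper name translate and the choice of stretch boundaries are illustrative only.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon; '*' marks a stop codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Leader coding stretch, from the ATG after the BamH I site, as printed in SEQ ID NO 11.
LEADER_DNA = "ATGAAATACCTATTGCCTACGGCAGCCGCTGGATTGTTATTACTCGCTGCCCAACCAGCCATGGCC"
# Spe I site, five histidine codons and stop codon from the end of SEQ ID NO 11.
TAIL_DNA = "ACTAGTCATCATCATCATCATTAA"

leader_protein = translate(LEADER_DNA)
tail_protein = translate(TAIL_DNA)
print(leader_protein)
print(tail_protein)
```

In one-letter code the first stretch reads as a 22-residue leader ending in Ala-Met-Ala, and the second as Thr-Ser followed by five histidines and a stop, in line with the (His)5 tail described above.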

d. Preparation of Recombinant Fab Fragments
Bacterial cultures for determination of
antigen-binding activity were grown in 96-well
tissue culture plates (Costar #3596). 250 µl
Superbroth [SB had the following ingredients per
liter: 10 g 3-(N-morpholino)propanesulfonic acid,
30 g tryptone, 20 g yeast extract at pH 7.0 at
25°C] containing 30 µg/ml chloramphenicol, 100
µg/ml carbenicillin, and 1% (w/v) glucose were
admixed per well and inoculated with a single
double-transformant prepared in Example 11c above.
The inoculated plates were then maintained with
moderate shaking (200 rpm) on a horizontal shaker
for 7-9 hours at 37°C, until the A^ was 
approximately 1-1.5. The cells were collected by
centrifugation of the microtiter plate (1,500 x g
for 30 minutes at 4°C), the supernatants were
discarded, and the cells were resuspended and
induced overnight at room temperature in fresh
media containing 1 mM IPTG, but no glucose. Cells
were harvested by centrifugation, resuspended in
175 µl PBS (10 mM sodium phosphate, 160 mM NaCl at
pH 7.4 at 25°C) containing 34 µg/ml
phenylmethylsulfonyl fluoride (PMSF) and 1.5% (w/v)
streptomycin sulfate, and lysed by 3 freeze-thaw
cycles between -80°C and 37°C. The resultant crude
extracts were partially cleared by centrifugation
as above before analysis by antigen-binding ELISA.

e. Assay and Determination of Relative
Affinities

Relative affinities were determined as
described in Example 2b6) after coating wells with
0.1 µg of antigen. The selected antigens included
tetanus toxoid and recombinant gp120 (strain IIIB)
and gp120 (strain SF2). For each antigen, a
negative control extract of XL1-Blue cells
co-transformed with pTC01 and pTAC01H was tested to
determine whether other components in E. coli had
any affinity for the antigens in the assay. Each
extract was assayed for BSA-binding activity and
BSA-positive clones were considered negative. All
possible single-transformants expressing one chain
only were prepared as described for the
double-transformants and were found to have no
affinity for any of the antigens used. Because of
the nature of the assay, whether this was due to a
lack of binding by the individual chains themselves
or due to a lack of expression or folding could not
be determined.

f. Results of Direct Crosses of Heavy and
Light Chains within a Set of gp120/gp160
Binding Antibodies

The Fab fragments derived from the bone
marrow of the same asymptomatic HIV donor but
panned against gp120 (IIIB), gp160 (IIIB), and
gp120 (SF2), were assigned to one of seven groups
based on the amino acid sequences of the CDR3 of
their heavy chains as described in Example 9. From
the same library, antibodies to the constrained
hypervariable V3 loop-like peptide JSISIGPGRAFYTGZC
(SEQ ID NO 141) were isolated. For the
chain-shuffling experiments, the following
representative members of each of the seven groups
shown in Figure 7 were chosen: b11; b6;
b4-b12-b7-b21; b3; s8; b1-b14-b24;
b13-b22-B26-b8-b18-b27-B8-B35-s4; and one loop
peptide-binding clone, p35.
Clones b4, b7, b12, and b21 showed neutralization
activity against HIV when monitoring inhibition of
infection by syncytia formation and clones b13,
b12, and b4 when monitoring p24 production as shown
in Example 3. Light and heavy chains were cloned
from the original constructs and cotransformed in
all possible binary combinations into XL1-Blue
cells as described above.

The results of the complete cross are shown in
Figure 16. As is to be expected, identical chains
derived from different Fab fragments had similar
binding properties, e.g., b18HC, b27HC, B8HC, B35HC,
s4HC. The crosses of the original heavy chains
with the original light chains in each case clearly
recapitulated binding activity. Minor differences
existed between some heavy chains with identical
variable domain sequences, e.g., b4 and b12
(constant domains were not sequenced for any of the
constructs). The exception is b8HC, which was
identical in its variable domain to b18HC, b27HC,
B8HC, B35HC, s4HC, yet shows more cross reactivity.
Presumably, this is due to differences in
expression levels in the cell or differences in the
constant domain sequences. Clear differences
existed between heavy chains in their tendency to
accept different light chains and still bind
antigen, but even the least promiscuous heavy chain
in the set panned against gp120 (IIIB), b1HC, still
did so in 43% of its crosses. On the other side of
the spectrum, 5 heavy chains, b11HC, b6HC, b12HC,
b7HC, and b8HC, crossed productively with all light
chains in this set. For the heavy chain crosses
examined in detail (all of s4HC, B35HC, B26HC; most
of b12HC, b12HC), no significant differences in
apparent binding affinity were found between Fab
fragments using the same heavy chain but different
light chains as shown in Figure 17 where the IC50
from competition with soluble gp120 (IIIB) was
approximately 10^-8 M.
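
The notion of promiscuity used here, the fraction of light chains with which a heavy chain still yields an antigen-binding Fab, can be made concrete with a small sketch. The binding matrix below is invented purely for illustration and is NOT the data of Figure 16; the chain labels and the function name promiscuity are hypothetical.

```python
def promiscuity(binding):
    """Map each heavy chain to the fraction of its crosses that bound antigen.

    binding: {heavy_chain: {light_chain: True/False}}
    """
    return {hc: sum(row.values()) / len(row) for hc, row in binding.items()}


# Invented example data (hypothetical labels), not results from the specification.
example = {
    "HC_a": {"LC_1": True, "LC_2": True, "LC_3": True, "LC_4": True},
    "HC_b": {"LC_1": True, "LC_2": False, "LC_3": False, "LC_4": True},
}

scores = promiscuity(example)
fully_promiscuous = [hc for hc, score in scores.items() if score == 1.0]
print(scores)
print(fully_promiscuous)
```

Applied to a real cross table, the same calculation would give the per-chain percentages quoted in the text, such as the 43% reported for the least promiscuous heavy chain.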

Within the original seven groups that were
established according to the sequence of the CDR3
of the heavy chains and that are indicated by
horizontal and vertical lines in Figure 16,
complete promiscuity was present, i.e., heavy and
light chains within these CDR3-determined groups
were completely promiscuous with each other.
However, there was a lack of promiscuity between
other groups, e.g., between b1HC-b24HC and
b13LC-s4LC. In the analysis of these
sequence-based groups, the protein antigen against
which the phage display library was panned was not
a critical factor. The exception to this case was
the cross of p35HC with all light chains; the only
cross that bound either to gp120 (SF2 strain) or
the original antigen, the loop peptide, was the
cross containing the original heavy and light
chains.

Unlike the heavy chains, no light chains 
crossed productively with all heavy chains nor were 
any distinguishable from the other light chains by 
unusually low promiscuity. 

In the neutralization assays performed as
described in Example 3, the directed cross
resulting from the pairing of the heavy chain from
clone b12 with the light chain from clone b21 was
effective at neutralizing HIV-1.

g. Interantigenic Crosses of Heavy and Light
Chains

To determine whether conclusions derived
from the crosses between high affinity Fab
fragments originating from the same library can be
extended to unrelated libraries, a non-related
γ1κ-Fab fragment (P3-13) specific for tetanus
toxoid from a different donor was chosen for a new
set of crosses [clone 3 in Persson et al., Proc.
Natl. Acad. Sci. USA, 88:2432-2436 (1991)].
Extracts were probed with tetanus toxoid or with
gp120 (IIIB). The data confirm the results from
the gp120 cross experiment in that the binding
activity towards the antigen was determined by the
heavy chain. The heavy chain of clone P3-13 paired
with the light chains b4, b12, b21, and b14 to
yield an Fab fragment with an affinity towards
tetanus toxoid; the light chain of P3-13 paired
with the heavy chains of b3, b6, b11, and b14 to
yield an Fab fragment with an affinity towards
gp120 (IIIB). None of the light chains originating
from the gp120 binders was able to confer gp120
specificity in combination with the P3-13 heavy
chain.

Similarly, the P3-13 light chain was unable to
generate tetanus toxoid specificity in combination
with any of the heavy chains originating from the
gp120 binders, confirming the dominance of the
heavy chain in the antibody-antigen interaction.
Interestingly, all three light chains that showed a
strong signal against tetanus toxoid (b4, b12, b21)
were members of the same group when sorted by the
CDR3s of their original heavy chains. As might be
expected from crosses between unrelated libraries,
not only was there a lower degree of promiscuity,
i.e., chains paired productively with far fewer
complementary chains, but the range of apparent
affinity constants determined by competition ELISA
was much broader (6.3 x 10^-6 to 6.3 x 10^-8 M). The
replacement of the original P3-13 light chain in
the P3-13 Fab fragment with another light chain
lowered the affinity of the Fab towards tetanus
toxoid 10 to 100-fold (from 6.3 x 10^-8 M to 6.3 x
10^-6 M). In the crosses of the light chain of P3-13
with all the heavy chains of the HIV pannings, the
productive crosses had similar affinities to gp120
(IIIB) (2.5 x 10^-7 to 6.3 x 10^-7 M), with the
exception of b14HC/P3-13LC, whose signal was too
weak for a definite determination of the apparent
binding constant. These affinities were approximately
five-fold lower than those of the gp120-heavy
chains with their original light chains.
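
The fold changes quoted in this passage follow from simple ratios of the apparent constants. The sketch below reproduces the arithmetic; it assumes the constants are molar values for which a larger number means weaker binding, and the last quantity is an inference from the "approximately five-fold" statement, not a value reported in the text.

```python
import math


def fold_change(weaker, stronger):
    """Ratio of two apparent constants in M (larger constant = weaker binding)."""
    return weaker / stronger


# Values quoted in the text, in M.
P3_13_ORIGINAL = 6.3e-8        # P3-13 heavy chain with its original light chain
P3_13_WEAKEST_CROSS = 6.3e-6   # weakest P3-13 heavy chain cross
GP120_CROSS_RANGE = (2.5e-7, 6.3e-7)  # P3-13 light chain with gp120 heavy chains

tetanus_fold = fold_change(P3_13_WEAKEST_CROSS, P3_13_ORIGINAL)

# Inferred only: the range implied for the original pairings if the crosses
# are about five-fold weaker, as the text states.
implied_original_range = tuple(k / 5 for k in GP120_CROSS_RANGE)

print(round(tetanus_fold))
print(implied_original_range)
```

The first ratio corresponds to the upper end of the 10 to 100-fold loss described for light chain replacement in the P3-13 Fab.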

Thus, the results show that chain shuffling is 
yet another maneuver allowed in vitro but not in 
vivo which can be expected to help extend antibody 
diversity beyond that of Nature. The overriding 
feature of the binary system of this invention is 
its ability to create large numbers (several 
hundred) of directed crosses between characterized 
light and heavy chains without the need for 
recloning individual chains for each cross after 
the initial vector construction. When used in 
combination with the phage-display method and 
biological assays, it allows the rapid analysis of 
the most interesting subset of the pool of 
antigen-binding clones by chain shuffling, with the 
aim of finding biologically or chemically active 
antibodies. For the set of antigens studied here, 
most heavy chains recombined with a number of light 
chains to yield an antigen-binding Fab fragment. 

These results have important implications for 
the diversity of combinatorial antibody libraries. 
While it is not possible to predict reliably the 
original in vivo combinations of light and heavy
chains due to the surprising promiscuity of
individual chains, recombinant antibody libraries 
take advantage of the fact that even distantly 
related Fabs against the same antigen can recombine 
in vitro to give chain combinations not found in 
vivo . In fact, after the identification of a 
certain number of antibodies that have been shown 
to possess some biological or chemical activity, it 
may be better to shuffle their individual chains in 
a directed fashion than to continue sampling 
randomly from the same pool of binders. By 
extension, the promiscuity observed in this system 
indicates that in libraries constructed using 
degenerate, chemically synthesized 
oligonucleotides, there should be considerable 
flexibility in which separate synthetic heavy 
chains can pair with separate synthetic light 
chains to generate separate antigen-binding Fab 
fragments. The diversity of combinatorial 
libraries coupled with chain-shuffling should allow 
wide exploration of three dimensional space thereby 
solving the problem of how to approximate molecules 
in the ternary complex of antibody, substrate and 
cofactor.

11. Deposit of Materials

The following cell lines have been deposited 
on September 30, 1992, with the American Type 
Culture Collection (ATCC) , 1301 Parklawn Drive, 
Rockville, MD, USA: 

Cell Line         ATCC Accession No.

E. coli MT11      ATCC 69078
E. coli MT12      ATCC 69079
E. coli MT13      ATCC 69080

The deposits listed above, MT11, MT12 and MT13,
are bacterial cells (E. coli) containing the
expression vector pComb2-3 for the respective
expression of the Fabs designated b11 (clone b11),
b12 (clone b12), and b13 (clone b13) prepared in
Example 2b. The sequences of the heavy and light
chain variable domains are listed in Figures 10A 
and 10B and 11A and 11B, respectively. This 
deposit was made with the ATCC under the provisions 
of the Budapest Treaty on the International 
Recognition of the Deposit of Microorganisms for 
the Purpose of Patent Procedure and the Regulations 
thereunder (Budapest Treaty) . This assures 
maintenance of a viable culture for 30 years from 
the date of deposit. The organisms will be made 
available by ATCC under the terms of the Budapest 
Treaty which assures permanent and unrestricted 
availability of the progeny of the culture to the 
public upon issuance of the pertinent U.S. patent 
or upon laying open to the public of any U.S. or 
foreign patent application, whichever comes first, 
and assures availability of the progeny to one 
determined by the U.S. Commissioner of Patents and 
Trademarks to be entitled thereto according to 35 
U.S.C. §122 and the Commissioner's rules pursuant
thereto (including 37 CFR §1.14 with particular 
reference to 886 OG 638) . The assignee of the 
present application has agreed that if the culture 
deposit should die or be lost or destroyed when 
cultivated under suitable conditions, it will be 
promptly replaced on notification with a viable 
specimen of the same culture. Availability of the 
deposited strain is not to be construed as a 
license to practice the invention in contravention 
of the rights granted under the authority of any 
government in accordance with its patent laws. 

The foregoing written specification is 
considered to be sufficient to enable one skilled 
in the art to practice the invention. The present
invention is not to be limited in scope by the cell
lines deposited, since the deposited embodiment is 
intended as a single illustration of one aspect of 
the invention and any cell lines that are 
functionally equivalent are within the scope of 
this invention. The deposit of material does not 
constitute an admission that the written 
description herein contained is inadequate to 
enable the practice of any aspect of the invention, 
including the best mode thereof, nor is it to be 
construed as limiting the scope of the claims to 
the specific illustration that it represents. 
Indeed, various modifications of the invention in 
addition to those shown and described herein will 
become apparent to those skilled in the art from 
the foregoing description and fall within the scope 
of the appended claims.

(1) GENERAL INFORMATION:

(i) APPLICANT: 

(A) NAME: THE SCRIPPS RESEARCH INSTITUTE 

(B) STREET: 10666 North Torrey Pines Road 

(C) CITY: La Jolla 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) POSTAL CODE (ZIP): 92037 

(G) TELEPHONE: 619-554-2937 

(H) TELEFAX: 619-554-6312 

(ii) TITLE OF INVENTION: HUMAN NEUTRALIZING MONOCLONAL ANTIBODIES
TO HUMAN IMMUNODEFICIENCY VIRUS 

(iii) NUMBER OF SEQUENCES: 170 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(v) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US 95/ 

(B) FILING DATE: 11-JUL-1995

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: US 08/276,852

(B) FILING DATE: 18-JUL-1994

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGCCGCAAAT TCTATTTCAA GGAGACAGTC ATAATGAAAT ACCTATTGCC TACGGCAGCC 60 
GCTGGATTGT TATTACTCGC TGCCCAACCA GCCATGGCCC AGGTGAAACT GCTCGAGATT 120 
TCTAGACTAG TTACCCGTAC GACGTTCCGG ACTACGGTTC TTAATAGAAT TCG 173 
(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2: 
TCGACGAATT CTATTAAGAA CCGTAGTCCG GAACGTCGTA CGGGTAACTA GTCTAGAAAT 60 
CTCGAGCAGT TTCACCTGGG CCATGGCTGG TTGGGCAGCG AGTAATAACA ATCCAGCGGC 120 
TGCCGTAGGC AATAGGTATT TCATTATGAC TGTCTCCTTG AAATAGAATT TGC 173 
(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TGAATTCTAA ACTAGTCGCC AACGAGACAG TCATAATGAA ATACCTATTG CCTACGGCAG 60 
CCGCTGGATT GTTATTACTC GCTGCCCAAC CAGCCATGGC CGAGCTCGTC AGTTCTAGAG 120 
TTAAGCGGCC G 131 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 139 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
TCGACGGCCG CTTAACTCTA GAACTGACGA GCTCGGCCAT GGCTGGTTGG GCAGCGAGTA 60 
ATAACAATCC AGCGGCTGCC GTAGGCAATA GGTATTTCAT TATGACTGTC TCCTTGGCGA 120 
CTAGTTTAGA ATTCAAGCT 139 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15

Ala Gln Pro Ala Met Ala Gln Val Lys Leu
20 25

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15

Ala Gln Pro Ala Met Ala Glu
20

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 198 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
TGTTGACAAT TAATCATCGG CTCGTATAAT GTGTGGAATT GTGAGCGGAT AACAATTTCA 60
CACAGGAGGA AGGATCCATG AAATACCTAT TGCCTACGGC AGCCGCTGGA TTGTTATTAC 120
TCGCTGCCCA ACCAGCCATG GCCGAGCTCG GTCGGTCGGT CCTCGAGGGT CGGTCGGTCT 180
CTAGAGTTAA GCGGCCGC 198
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 198 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
GCGGCCGCTT AACTCTAGAG ACCGACCGAC CCTCGAGGAC CGACCGACCG AGCTCGGCCA 60
TGGCTGGTTG GGCAGCGAGT AATAACAATC CAGCGGCTGC CGTAGGCAAT AGGTATTTCA 120
TGGATCCTTC CTCCTGTGTG AAATTGTTAT CCGCTCACAA TTCCACACAT TATACGAGCC 180
GATGATTAAT TGTCAACA 198
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Lys Thr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala 
1 5 10 15

Ala Gln Pro Ala Met Ala Glu Leu
20 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 220 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

TGTTGACAAT TAATCATCGG CTCGTATAAT GTGTGGAATT GTGAGCGGAT AACAATTTCA 60 

CACAGGAGGA AGGATCCATG AAATACCTAT TGCCTACGGC AGCCGCTGGA TTGTTATTAC 120 

TCGCTGCCCA ACCAGCCATG GCCCAGGTGA AACTGCTCGA GGGTCGGTCG GTCTCTAGAC 180 

GGTCGGTCGG TCACTAGTCA TCATCATCAT CATTAAGCTA 220 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 220 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: circular

(ii) MOLECULE TYPE: DNA (genomic)

(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
TAGCTTAATG ATGATGATGA TGACTAGTGA CCGACCGACC GTCTAGAGAC CGACCGACCC 60

TCGAGCAGTT TCACCTGGGC CATGGCTGGT TGCCCAGCGA GTAATAACAA TCCAGCGGCT 120 

GCCGTAGGCA ATAGGTATTT CATGGATCCT TCCTCCTGTG TGAAATTGTT ATCCGCTCAC 180 

AATTCCACAC ATTATACGAG CCGATGATTA ATTGTCAACA 220 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: N-terminal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Met Lys Thr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15

Ala Gln Pro Ala Met Ala Gln Val Lys Leu Leu Glu
20 25

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: C-terminal



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Thr Ser His His His His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
GGCCGCAAAT TCTATTTCAA GGAGACAGTC AT 32 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
AATGAAATAC CTATTGCCTA CGGCAGCCGC TGGATT 36 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
GTTATTACTC GCTGCCCAAC CAGCCATGGC CC 32
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CAGTTTCACC TGGGCCATGG CTGGTTGGG 29
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
CAGCGAGTAA TAACAATCCA GCGGCTGCCG TAGGCAATAG 40
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

GTATTTCATT ATGACTGTCT CCTTCAAATA GAATTTCC 38 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 
AGCTGAAACT GCTCGAGATT TCTAGACTAG TTACCCGTAC 40 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 
CGGAACGTCG TACGGGTAAC TAGTCTAGAA ATCTCGAG 38
(2) INFORMATION FOR SEQ ID NO: 23:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
GACGTTCCGG ACTACGGTTC TTAATAGAAT TCG 33
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 
TCGACGAATT CTATTAAGAA CCGTAGTC 28
(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
TGAATTCTAA ACTAGTCCCC AAGGAGACAG TCAT 34
(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: 
AATGAAATAC CTATTGCCTA CGGCAGCCGC TGGATT 36
(2) INFORMATION FOR SEQ ID NO:27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
GTTATTACTC GCTGCCCAAC CAGCCATGGC C 31
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 
GAGCTCGTCA GTTCTAGAGT TAAGCGGCCG 30 
(2) INFORMATION FOR SEQ ID NO:29: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 48 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GTATTTCATT ATGACTGTCT CCTTGGCGAC TAGTTTAGAA TTCAAGCT 48 
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CAGCGAGTAA TAACAATCCA GCGGCTGCCG TAGGCAATAG 40
(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
TGACGAGCTC GCCCATGGCT GGTTGGG 27
(2) INFORMATION FOR SEQ ID NO: 32: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
TCGACGGCCG CTTAACTCTA GAAC 24 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 666 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

CCATTCGTTT GTGAATATCA ACGCCAAGGC CAATCGTCTG ACCTGCCTCA ACCTCCTGTC 60 

AATGCTGGCG GCGGCTCTGG TGGTGGTTCT GGTGGCGGCT CTGAGGGTGG TGGCTCTGAG 120 

GGTGGCGGTT CTGAGGGTGG CGGCTCTGAG GGAGGCGGTT CCGGTGGTGG CTCTGGTTCC 180 

GGTGATTTTC ATTATGAAAA GATGGCAAAC GCTAATAAGG GGGCTATGAC CGAAAATGCC 240 

GATGAAAACG CGCTACAGTC TGACGCTAAA GGCAAACTTG ATTCTGTCGC TACTGATTAC 300 

GGTGCTGCTA TCGATGGTTT CATTGGTGAC GTTTCCGGCC TTGCTAATGG TAATGGTGCT 360 

ACTCGTGATT TTGCTGGCTC TAATTCCCAA ATGGCTCAAC TCGGTGACGG TGATAATTCA 420 

CCTTTAATGA ATAATTTCCG TCAATATTTA CCTTCCCTCC CTCAATCGGT TGAATGTCGC 480 

CCTTTTGTCT TTAGCGCTGG TAAACCATAT GAATTTTCTA TTGATTGTGA CAAAATAAAC 540 

TTATTCGGTG TCTTTGCGTT TCTTTTATAT GTTGCCACCT TTATGTATGT ATTTTCTACG 600 

TTTGCTAACA TACTGCGTAA TAAGGAGTCT TAATCATGCC AGTTCTTTTG GGTATTCCGT 660 

TATTAT 666 
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 211 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: unknown 

(ii) MOLECULE TYPE: protein 

(iii) HYPOTHETICAL: NO 

(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

Pro Phe Val Cys Glu Tyr Gln Gly Gln Gly Gln Ser Ser Asp Leu Pro
1 5 10 15

Gln Pro Pro Val Asn Ala Gly Gly Gly Ser Gly Gly Gly Ser Gly Gly
20 25 30

Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly Ser Glu Gly Gly Gly
35 40 45

Ser Glu Gly Gly Gly Ser Gly Gly Gly Ser Gly Ser Gly Asp Phe Asp
50 55 60

Tyr Glu Lys Met Ala Asn Ala Asn Lys Gly Ala Met Thr Glu Asn Ala
65 70 75 80

Asp Glu Asn Ala Leu Gln Ser Asp Ala Lys Gly Lys Leu Asp Ser Val
85 90 95

Ala Thr Asp Tyr Gly Ala Ala Ile Asp Gly Phe Ile Gly Asp Val Ser
100 105 110

Gly Leu Ala Asn Gly Asn Gly Ala Thr Gly Asp Phe Ala Gly Ser Asn
115 120 125

Ser Gln Met Ala Gln Val Gly Asp Gly Asp Asn Ser Pro Leu Met Asn
130 135 140

Asn Phe Arg Gln Tyr Leu Pro Ser Leu Pro Gln Ser Val Glu Cys Arg
145 150 155 160

Pro Phe Val Phe Ser Ala Gly Lys Pro Tyr Glu Phe Ser Ile Asp Cys
165 170 175

Asp Lys Ile Asn Leu Phe Arg Gly Val Phe Ala Phe Leu Leu Tyr Val
180 185 190

Ala Thr Phe Met Tyr Val Phe Ser Thr Phe Ala Asn Ile Leu Arg Asn
195 200 205 

Lys Glu Ser 
210 

(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRAND EDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GAGACGACTA GTGGTGGCGG TGGCTCTCCA TTCGTTTGTG AATATCAA 48 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 
TTACTAGCTA GCATAATAAC GGAATACCCA AAAGAACTGG 40 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
TATGCTAGCT AGTAACACGA CAGGTTTCCC GACTGG 36 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
ACCGAGCTCG AATTCGTAAT CATGGTC 27 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
AGCTGTTGAA TTCGTGAAAT TGTTATCCGC T 31 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 708 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40: 
GAGACGACTA GTGGTGGCGG TGGCTCTCCA TTCGTTTGTG AATATCAAGG CCAAGGCCAA 60

TCGTCTGACC TGCCTCAACC TCCTGTCAAT GCTGGCGGCG GCTCTGGTGG TGGTTCTGGT 120

GGCGGCTCTG AGGGTGGTGG CTCTGAGGGT GGCGGTTCTG AGGGTGGCGG CTCTGAGGGA 180

GGCGGTTCCG GTGGTGGCTC TGGTTCCGGT GATTTTGATT ATGAAAAGAT GGCAAACGCT 240

AATAAGGGGG CTATGACCGA AAATGCCGAT GAAAACGCGC TACAGTCTGA CGCTAAAGGC 300

AAACTTGATT CTGTCGCTAC TGATTACGGT GCTGCTATCG ATGGTTTCAT TGGTGACGTT 360

TCCGGCCTTG CTAATGGTAA TGGTGCTACT GGTGATTTTG CTGGCTCTAA TTCCCAAATG 420

GCTCAAGTCG GTGACGGTGA TAATTCACCT TTAATCAATA ATTTCCGTCA ATATTTACCT 480

TCCCTCCCTC AATCGGTTGA ATGTCGCCCT TTTGTCTTTA GCGCTGGTAA ACCATATGAA 540

TTTTCTATTG ATTGTGACAA AATAAACTTA TTCCGTGCTG TCTTTGCGTT TCTTTTATAT 600

GTTGCCACCT TTATGTATGT ATTTTCTACG TTTGCTAACA TACTGCGTAA TAAGGAGTCT 660

TAATCATGCC AGTTCTTTTG GGTATTCCGT TATTATGCTA GCTAGTAA 708



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 201 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

TATGCTAGCT AGTAACACGA CAGGTTTCCC GACTGGAAAG CGGGCAGTGA GCGCAACGCA 60 

ATTAATGTGA GTTAGCTCAC TCATTAGGCA CCCCAGGCTT TACACTTTAT GCTTCCCGCT 120 

CGTATGTTGT GTGGAATTGT GAGCGGATAA CAATTTCACA CAGGAAACAG CTATGACCAT 180 

GATTACGAAT TCGAGCTCGG T 201 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 24 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42: 
CAGCTGCAGC TCGAGCAGTC TGGG 24
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GAGGTGCAGC TCGAGGAGTC TGGG 24
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
GCATGTACTA GTTTTGTCAC AAGATTTGGG 30
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45: 
GACATCGAGC TCACCCAGTC TCCA 24
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
GAAATTGAGC TCACGCAGTC TCCA 24
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
GCGCCGTCTA GAACTAACAC TCTCCCCTGT TGAAGCTCTT TGTGACGCGC AAG 53
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: circular 

(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 

Ser Ile Ser Gly Pro Gly Arg Ala Phe Tyr Thr Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49: 
GTCCTTGACC AGGCAGCCCA G 21
(2) INFORMATION FOR SEQ ID NO: 50:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
ATAGAAGTTG TTCAGCAGGC A 21 
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 
ATTAACCCTC ACTAAAG 17 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
GAATTCTAAA CTAGCTAGTT CG 22
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Leu Glu Glu Ser Gly Thr Glu Phe Lys Pro Pro Gly Ser Ser Val Lys 
1 5 10 15 

Val Ser Cys Lys Ala Ser Gly Gly Thr Phe Gly Asp Tyr Ala Ser Asn 
20 25 30 

Tyr Ala Ile Ser Trp Val Arg Gln Ala Pro Gly Gln Gly Leu Glu Tyr
35 40 45

Ile Gly Gly Ile Thr Pro Thr Ser Gly Ser Ala Asp Tyr Ala Gln Lys
50 55 60

Phe Gln Gly Arg Val Thr Ile Ser Ala Asp Arg Phe Thr Pro Ile Leu
65 70 75 80

Tyr Met Glu Leu Arg Ser Leu Arg Ile Glu Asp Thr Ala Ile Tyr Tyr
85 90 95

Cys Ala Arg Glu Arg Arg Glu Arg Gly Trp Asn Pro Arg Ala Leu Arg
100 105 110

Gly Ala Leu Asp Phe Trp Gly Gln Gly Thr Arg Val Phe Val Ser Pro
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

Leu Glu Glu Ser Gly Ala Ala Val Gln Lys Pro Gly Ser Ser Val Arg
1 5 10 15

Val Ser Cys Gln Ala Ser Gly Gly Thr Phe Asp Asn Phe Ala Ser Asn
20 25 30

Tyr Ala Val Ser Trp Val Arg Gln Ala Pro Gly Gln Gly Leu Glu Trp
35 40 45

Met Gly Gly Ile Thr Pro Thr Ser Gly Thr Ala Thr Tyr Ser Gln Lys
50 55 60

Phe Gln Gly Arg Val Thr Ile Ser Ala Ala Pro Leu Thr Pro Ile Ile
65 70 75 80

Tyr Met Glu Leu Arg Ser Leu Arg Asp Asp Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Glu Arg Arg Glu Arg Gly Trp Asn Pro Arg Ala Leu Val
100 105 110

Gly Ala Leu Asp Val Trp Gly Gln Gly Thr Thr Val
115 120 

(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

Leu Glu Glu Ser Gly Thr Glu Phe Lys Pro Pro Gly Ser Ser Val Lys 
1 5 10 15

Val Ser Cys Lys Ala Ser Gly Gly Thr Phe Gly Asp Tyr Ala Ser Asn
20 25 30

Tyr Ala Ile Ser Trp Val Arg Gln Ala Pro Gly Gln Gly Leu Glu Tyr
35 40 45

Ile Gly Gly Ile Thr Pro Thr Ser Gly Ser Ala Asp Tyr Ala Gln Lys
50 55 60

Phe Gln Gly Arg Val Thr Ile Ser Ala Asp Arg Phe Thr Pro Ile Leu
65 70 75 80

Tyr Met Glu Leu Arg Ser Leu Arg Ile Glu Asp Thr Ala Ile Tyr Tyr
85 90 95

Cys Ala Arg Glu Arg Arg Glu Arg Gly Trp Asn Pro Arg Ala Leu Arg
100 105 110

Gly Ala Leu Asp Phe Trp Gly Gln Gly Thr Arg Val Phe Val Ser Pro
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Leu Glu Glu Ser Gly Ala Glu Val Lys Lys Pro Gly Ser Ser Val Lys 
1 5 10 15

Val Ser Cys Lys Ala Ser Gly Gly Ile Phe Ser Asp Phe Ala Ser Asn
20 25 30

Tyr Ala Ile Ser Trp Val Arg Gln Ala Pro Gly Gln Gly Leu Glu Tyr
35 40 45

Met Gly Gly Ile Thr Pro Thr Ser Gly Ser Ala Asp Tyr Ala Gln Lys
50 55 60

Phe Gln Gly Arg Val Thr Ile Ser Ala Asp Ala Ala Thr Pro Arg Val
65 70 75 80

Tyr Met Glu Leu Arg Ile Leu Arg Ser Glu Asp Thr Ala Val Tyr Phe
85 90 95

Cys Ala Arg Glu Arg Arg Glu Arg Gly Trp Asn Pro Arg Ala Leu Arg
100 105 110

Gly Ala Leu Glu Val Trp Gly Gln Gly Thr Thr Val Ile Val Ser Pro
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

Leu Glu Glu Ser Gly Ala Ala Val Gln Lys Pro Gly Ser Ser Val Arg
1 5 10 15

Val Ser Cys Gln Ala Ser Gly Gly Thr Phe Asp Asn Phe Ala Ser Asn
20 25 30

Tyr Ala Val Ser Trp Val Arg Gln Ala Pro Gly Gln Gly Leu Glu Trp
35 40 45

Met Gly Gly Ile Thr Pro Thr Ser Gly Thr Ala Thr Tyr Ser Gln Lys
50 55 60

Phe Gln Gly Arg Val Thr Ile Ser Ala Ala Pro Leu Thr Pro Ile Ile
65 70 75 80

Tyr Met Glu Leu Arg Ser Leu Arg Asp Asp Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Glu Arg Arg Glu Arg Gly Trp Asn Pro Arg Ala Leu Val
100 105 110

Gly Ala Leu Asp Val Trp Gly Gln Gly Thr Thr Val Ile Val Ser Ser
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: 

Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly Ser Ser Val Lys
1 5 10 15

Val Ser Cys Lys Thr Ser Gly Gly Thr Phe Ser Asp Tyr Ala Ser Asn
20 25 30

His Ala Ile Ser Trp Val Arg Gln Ala Pro Gly Gln Gly Leu Glu Tyr
35 40 45

Met Gly Gly Ile Thr Pro Thr Ser Gly Thr Ala Asp Tyr Ala Gln Lys
50 55 60

Phe Gln Ala Arg Val Thr Ile Ser Ala His Glu Phe Thr Pro Ile Val
65 70 75 80

Tyr Met Glu Leu Arg Ser Leu Arg Ser Asp Gln His Ala Thr Tyr Tyr
85 90 95

Cys Ala Thr Glu Arg Arg Glu Arg Gly Trp Asn Pro Arg Ala Leu Arg
100 105 110

Gly Ala Leu Asp Ile Trp Gly Gln Gly Thr Thr Val Ile Val Ser Ser
115 120 125



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Leu Glu Glu Ser Gly Gly Arg Leu Val Lys Pro Gly Gly Ser Leu Arg 
1 5 10 15

Leu Ser Cys Glu Gly Ser Gly Phe Thr Phe Thr Asn Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Ser Ile Ser Arg Asn Asp Leu Glu Asp Lys Met Phe Leu
65 70 75 80

Glu Met Ser Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Lys Tyr Pro Arg Tyr Ser Asp Met Val Thr Gly Val Arg Asn His
100 105 110

Phe Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ile Val Ser Ser
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Leu Glu Gln Ser Gly Gly Gly Leu Val Lys Pro Gly Gly Ser Leu Arg
1 5 10 15

Leu Ser Cys Glu Gly Ser Gly Phe Thr Phe Thr Asn Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Thr Ile Ser Arg Asn Asp Leu Glu Asp Lys Leu Phe Leu
65 70 75 80

Glu Met Ser Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Lys Tyr Pro Arg Tyr Phe Asp Met Met Ala Gly Val Arg Asn His
100 105 110

Phe Tyr Met Asp Val Trp Gly Thr Gly Thr Thr Val Ile Val Ser Ser
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Leu Glu Glu Ser Gly Gly Gly Leu Val Lys Pro Gly Gly Ser Leu Arg 
1 5 10 15

Leu Ser Cys Glu Gly Ser Gly Phe Thr Phe Thr Asn Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Thr Ile Ser Arg Asn Asp Leu Glu Asp Lys Leu Phe Leu
65 70 75 80

Glu Met Ser Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Lys Tyr Pro Arg Tyr Ser Asp Met Met Ala Gly Val Arg Asn His
100 105 110

Leu Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ile Val Ser Ser
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Leu Glu Glu Ser Gly Gly Arg Leu Val Lys Pro Gly Gly Ser Leu Arg 
1 5 10 15 

Leu Ser Cys Glu Ala Ser Gly Phe Thr Phe Thr Asn Ser Trp Met Thr 
20 25 30 

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Arg Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Ser Ile Ser Arg Asn Asp Leu Glu Asp Lys Met Phe Leu
65 70 75 80

Glu Met Ser Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Lys Tyr Pro Arg Tyr Ser Asp Met Met Thr Gly Val Arg Asn His
100 105 110

Phe Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ile Val Ser Ser
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Leu Glu Glu Ser Gly Gly Gly Leu Val Lys Pro Gly Gly Ser Leu Arg 
1 5 10 15

Leu Ser Cys Glu Ser Ser Gly Phe Thr Phe Thr Asn Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Thr Ile Ser Arg Asn Asp Leu Glu Asp Lys Leu Phe Leu
65 70 75 80

Glu Met Ser Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Lys Tyr Pro Arg Tyr Ser Asp Met Met Ala Gly Val Arg Asn His
100 105 110

Phe Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ile Val Ser Ser
115 120 125

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Leu Glu Glu Ser Gly Gly Arg Leu Val Lys Pro Gly Gly Ser Leu Arg 
1 5 10 15

Leu Ser Cys Glu Gly Ser Gly Phe Thr Phe Thr Asn Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Ser Ile Ser Arg Asn Asp Leu Glu Asp Lys Met Phe Leu
65 70 75 80

Glu Met Ser Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Lys Tyr Pro Arg Tyr Ser Asp Met Met Thr Gly Val Arg Asn His
100 105 110

Phe Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ile Val Ser Ser
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 




Leu Glu Glu Ser Gly Gly Gly Leu Val Lys Pro Gly Gly Ser Leu Arg 

1 5 10 15

Leu Ser Cys Ala Gly Ser Gly Phe Thr Phe Thr Asn Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Ser His Tyr Pro Gly Pro Val Glu
50 55 60

Gly Arg Phe Thr Ile Ser Arg Asn Tyr Ile Glu Asp Lys Leu Phe Leu
65 70 75 80

Glu Met Ser Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Lys Tyr Pro Arg Tyr Tyr Asp Met Met Arg Gly Val Arg Asn His
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ile Val Ser Ser
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly Ala Ser Val Lys
1 5 10 15

Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn Phe Val Ile His
20 25 30

Trp Val Arg Gln Ala Pro Gly Gln Arg Phe Glu Trp Met Gly Trp Ile
35 40 45

Asn Pro Tyr Asn Gly Asn Lys Glu Phe Ser Ala Lys Phe Gln Asp Arg
50 55 60

Val Thr Phe Thr Ala Asp Thr Ser Ala Asn Thr Ala Tyr Met Glu Leu
65 70 75 80

Arg Ser Leu Arg Ser Ala Asp Thr Ala Val Tyr Tyr Cys Ala Arg Val
85 90 95

Gly Pro Tyr Ser Trp Asp Asp Ser Pro Gln Asp Asn Tyr Tyr Met Asp
100 105 110

Val Trp Gly Lys Gly Thr Thr Val Ile Val Ser Ser
115 120 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67: 

Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly Ala Ser Val Lys
1 5 10 15

Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn Phe Val Ile His
20 25 30

Trp Val Arg Gln Ala Pro Gly Gln Arg Phe Glu Trp Met Gly Trp Ile
35 40 45

Asn Pro Tyr Asn Gly Asn Lys Glu Phe Ser Ala Lys Phe Gln Asp Arg
50 55 60

Val Thr Phe Thr Ala Asp Thr Asp Ala Asn Thr Ala Tyr Met Glu Leu
65 70 75 80

Arg Ser Leu Arg Ser Ala Asp Thr Ala Ile Tyr Tyr Cys Ala Arg Val
85 90 95

Gly Pro Tyr Thr Trp Asp Asp Ser Pro Gln Asp Asn Tyr Tyr Met Asp
100 105 110

Val Trp Gly Lys Gly Thr Lys Val Ile Val Ser Ser
115 120 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly Ala Ser Val Lys
1 5 10 15

Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn Phe Val Ile His
20 25 30

Trp Val Arg Gln Ala Pro Gly Gln Arg Phe Glu Trp Met Gly Trp Ile
35 40 45

Asn Pro Tyr Asn Gly Asn Lys Glu Phe Ser Ala Lys Phe Gln Asp Arg
50 55 60

Val Thr Phe Thr Ala Asp Thr Asp Ala Asn Thr Ala Tyr Met Glu Leu
65 70 75 80

Arg Ser Leu Arg Ser Thr Asp Thr Ala Ile Tyr Tyr Cys Ala Arg Val
85 90 95

Gly Pro Tyr Thr Trp Asp Asp Ser Pro Gln Asp Asn Tyr Tyr Met Asp
100 105 110

Val Trp Gly Lys Gly Thr Lys Val Ile Val Ser Ser
115 120

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Leu Glu Glu Ser Gly Gly Gly Leu Val Lys Pro Gly Gly Ser Leu Arg 
1 5 10 15

Leu Ser Cys Val Gly Ser Gly Phe Thr Phe Ser Ser Ala Trp Met Ala
20 25 30

Trp Val Arg Gln Ala Pro Gly Arg Gly Leu Glu Trp Val Gly Leu Ile
35 40 45

Lys Ser Lys Ala Asp Gly Glu Thr Thr Asp Tyr Ala Thr Pro Val Lys
50 55 60

Gly Arg Phe Ser Ile Ser Arg Asn Asn Leu Glu Asp Thr Val Tyr Leu
65 70 75 80

Gln Met Asp Ser Leu Arg Ala Asp Asp Thr Ala Val Tyr Tyr Cys Ala
85 90 95

Thr Gln Lys Pro Arg Tyr Phe Asp Leu Leu Ser Gly Gln Tyr Arg Arg
100 105 110 

Val Ala Gly Ala Phe Asp Val Trp Gly His Gly Thr Thr Val Thr Val 
115 120 125 

Ser Pro 
130 

(2) INFORMATION FOR SEQ ID NO: 70:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Leu Glu Glu Ser Gly Gly Gly Leu Val Lys Ala Gly Gly Ser Leu Arg 
1 5 10 15 

Leu Ser Cys Val Gly Ser Gly Phe Thr Phe Ser Ser Ala Trp Met Ala 
20 25 30 

Trp Val Gly Gln Ala Pro Gly Arg Gly Leu Glu Trp Val Gly Leu Ile
35 40 45

Lys Ser Lys Ala Asp Gly Glu Thr Thr Asp Tyr Ala Thr Pro Val Lys
50 55 60

Gly Arg Phe Ser Ile Ser Arg Asn Asn Leu Glu Asp Thr Val Tyr Leu
65 70 75 80

Gln Met Asp Ser Leu Arg Ala Asp Asp Thr Ala Val Tyr Tyr Cys Ala
85 90 95

Thr Gln Lys Pro Arg Tyr Phe Asp Leu Leu Ser Gly Gln Tyr Arg Arg
100 105 110

Val Ala Gly Ala Phe Asp Val Trp Gly His Gly Thr Thr Val Thr Val
115 120 125 

Ser Pro 
130 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71: 

Leu Glu Glu Ser Gly Gly Gly Leu Ile Lys Pro Gly Gly Ser Leu Arg
1 5 10 15

Leu Ser Cys Val Gly Ser Gly Phe Thr Phe Ser Ser Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Ile Gly Leu Ile
35 40 45

Lys Ser Lys Ala Asp Gly Glu Thr Thr Asp Tyr Ala Thr Pro Val Lys
50 55 60

Gly Arg Phe Thr Ile Ser Arg Asn Asn Leu Glu Asn Thr Val Tyr Leu
65 70 75 80

Gln Met Asp Ser Leu Arg Ala Asp Asp Thr Ala Val Tyr Tyr Cys Ala
85 90 95

Thr Gln Lys Pro Ser Tyr Tyr Asn Leu Leu Ser Gly Gln Tyr Arg Arg
100 105 110

Val Ala Gly Ala Phe Asp Val Trp Gly His Gly Thr Thr Val Thr Val
115 120 125

Ser Pro
130



(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:

Leu Glu Glu Ser Gly Glu Ala Val Val Gln Pro Gly Arg Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Ala Ser Gly Phe Ile Phe Arg Asn Tyr Ala Met His
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Leu Ile
35 40 45

Lys Tyr Asp Gly Arg Asn Lys Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met
65 70 75 80

Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Asp
85 90 95

Ile Gly Leu Lys Gly Glu His Tyr Asp Ile Leu Thr Ala Tyr Gly Pro
100 105 110

Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Leu Glu Gln Ser Gly Glu Ala Val Val Gln Pro Gly Thr Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Arg Asn Tyr Ala Met His
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Leu Ile
35 40 45

Lys Tyr Asp Gly Arg Asn Lys Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Ser Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Glu Met
65 70 75 80

Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Asp
85 90 95

Ile Gly Leu Lys Gly Glu His Tyr Asp Ile Leu Thr Ala Tyr Gly Pro
100 105 110

Asp Tyr Trp Gly Gln Gly Ala Leu Val Thr Val Ser Ser
115 120 125

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Leu Glu Gln Ser Gly Glu Ala Val Val Gln Pro Gly Arg Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Ala Ser Gly Phe Ile Phe Arg Asn Tyr Ala Met His
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Leu Ile
35 40 45

Lys Tyr Asp Gly Arg Asn Lys Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met
65 70 75 80

Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Asp
85 90 95

Ile Gly Leu Lys Gly Glu His Tyr Asp Ile Leu Thr Ala Tyr Gly Pro
100 105 110

Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser
115 120 125

(2) INFORMATION FOR SEQ ID NO: 75:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Leu Glu Glu Ser Gly Glu Ala Val Val Gln Pro Gly Thr Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Arg Asn Tyr Ala Met His
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Leu Ile
35 40 45

Lys Tyr Asp Gly Arg Asn Lys Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Ser Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Glu Met
65 70 75 80

Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Asp
85 90 95

Ile Gly Leu Lys Gly Glu His Tyr Asp Ile Leu Thr Ala Tyr Gly Pro
100 105 110

Asp Tyr Trp Gly Gln Gly Ala Leu Val Thr Val Ser Ser
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Leu Glu Gln Ser Gly Glu Ala Val Val Gln Pro Gly Arg Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Ala Ser Gly Phe Thr Phe Arg Asn Tyr Ala Met His
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Leu Ile
35 40 45

Lys Tyr Asp Gly Arg Asn Lys Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met
65 70 75 80

Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Asp
85 90 95

Ile Gly Leu Lys Ala Glu His Tyr Asp Ile Leu Thr Ala Tyr Gly Pro
100 105 110

Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Leu Glu Gln Ser Gly Glu Ala Val Val Gln Pro Gly Arg Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Ala Ser Gly Phe Ile Phe Arg Asn Tyr Ala Met His
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Leu Ile
35 40 45

Lys Tyr Asp Gly Arg Asn Lys Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Asp Asn Ser Lys Asn Thr Leu Tyr Leu Gln Met
65 70 75 80 

Asn Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Arg Asp 
85 90 95 






Ile Gly Leu Lys Gly Glu His Tyr Asp Ile Leu Thr Ala Tyr Gly Pro
100 105 110

Asp Tyr Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Leu Glu Gln Ser Gly Gly Gly Val Val Lys Pro Gly Gly Ser Leu Arg
1 5 10 15

Leu Ser Cys Glu Gly Ser Gly Phe Thr Phe Pro Asn Ala Trp Met Thr
20 25 30

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Thr Ile Ser Arg Asn Asp Leu Glu Asp Lys Val Phe Leu
65 70 75 80

Gln Met Asn Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95

Thr Arg Tyr Pro Arg Tyr Ser Glu Met Met Gly Gly Val Arg Lys His
100 105 110 

Phe Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ser Val Ser Ser 
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 128 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Leu Glu Glu Ser Gly Gly Gly Val Val Lys Pro Gly Gly Ser Leu Arg 
1 5 10 15 

Leu Ser Cys Glu Gly Ser Gly Phe Thr Phe Pro Asn Ala Trp Met Thr 
20 25 30 

Trp Val Arg Gln Ser Pro Gly Lys Gly Leu Glu Trp Val Ala Ser Ile
35 40 45

Lys Ser Lys Phe Asp Gly Gly Ser Pro His Tyr Ala Ala Pro Val Glu
50 55 60

Gly Arg Phe Thr Ile Ser Arg Asn Asp Leu Glu Asp Lys Val Phe Leu
65 70 75 80

Gln Met Asn Gly Leu Lys Ala Glu Asp Thr Gly Val Tyr Tyr Cys Ala
85 90 95 

Thr Arg Tyr Pro Arg Tyr Ser Glu Met Met Gly Gly Val Arg Lys His 
100 105 110 

Phe Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Ser Val Ser Ser 
115 120 125 



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 122 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Leu Glu Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Arg Ser Leu Arg
1 5 10 15

Val Ser Cys Glu Ala Ser Gly Phe Thr Phe Ser Ser Tyr Glu Met Asn
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Gln Ile
35 40 45

Ser Ser Ser Gly Ser Arg Thr Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Asp Asn Ala Lys Asn Ser Leu Tyr Leu Glu Met
65 70 75 80

Thr Ser Leu Arg Val Asp Asp Thr Ala Val Tyr Tyr Cys Ala Arg Gly
85 90 95

Arg Arg Leu Val Thr Phe Gly Gly Val Val Ser Gly Gly Asn Ile Trp
100 105 110

Gly Gln Gly Thr Met Val Thr Val Ser Ser
115 120 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Leu Glu Gln Ser Gly Gly Gly Val Val Gln Pro Gly Arg Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Gly Ser Gly Phe Asn Phe Ser Asp Asp Thr Met His
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Val Ile
35 40 45

Ser Tyr Glu Gly Ser Asp Lys Tyr Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Asp Asn Ser Glu Asn Thr Leu Tyr Leu Gln Met
65 70 75 80

Asp Ser Leu Arg Ala Asp Asp Thr Ala Leu Tyr Tyr Cys Ala Arg Asn
85 90 95

Thr Arg Glu Asn Ile Glu Ala Asp Gly Thr Ala Tyr Tyr Ser Tyr Tyr
100 105 110 

Met Asp Val Trp Gly Lys Gly Thr Thr Val Thr Val Ser Ser 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 82: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Gly Ile Ser Asn Tyr Leu Ala
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Val Pro Arg Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Thr Leu Gln Pro Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Val Ala Thr Tyr Tyr Cys Gln Lys Tyr Asn Ser Ala Pro Arg Thr Phe
85 90 95

Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Ile Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Gly Ile Asn Asn Tyr Leu Ala
20 25 30

Trp Tyr Gln Gln Arg Pro Gly Lys Val Pro Arg Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Thr Leu Gln Ser Gly Val Pro Thr Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Val Ala Thr Tyr Tyr Cys Gln Lys Tyr Asn Ser Val Pro Arg Thr Phe
85 90 95

Gly Gly Gly Thr Lys Val Glu Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Gly Ile Ser Asn Tyr Leu Ala
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Val Pro Lys Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Thr Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Val Ala Thr Tyr Tyr Cys Gln Lys Tyr Asn Ser Ala Pro Arg Thr Phe
85 90 95

Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Ile Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Gly Ile Asn Asn Tyr Leu Ala
20 25 30

Trp Tyr Gln Gln Arg Pro Gly Lys Ala Pro Asn Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Thr Leu Gln Ser Gly Val Pro Pro Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Val Ala Thr Tyr Tyr Cys Gln Lys Tyr Asn Ser Val Pro His Thr Phe
85 90 95

Gly Gly Gly Thr Lys Val Glu Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val Ile Ser Asn Tyr Leu
20 25 30

Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Val Ser Asn Arg Ala Thr Gly Ile Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Ser Cys Gln Gln Tyr Gly Thr Ser Pro Trp Thr
85 90 95

Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val Ser Asn Asn Tyr Leu
20 25 30

Ala Trp Tyr Gln Gln Arg Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Ala Ser Asn Arg Ala Thr Gly Ile Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Ala Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu
65 70 75 80

Asp Val Ala Ile Tyr Tyr Cys Gln Gln Tyr His Ser Ser Pro Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser His Arg Val Asn Asn Asn Phe Leu
20 25 30

Ala Trp Tyr Gln Gln Lys Pro Gln Ala Pro Arg Leu Leu Ile Ser Gly
35 40 45

Ala Ser Thr Arg Ala Thr Gly Ile Pro Asp Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Asp Asp
65 70 75 80

Phe Ala Val Tyr Tyr Cys Gln Gln Tyr Gly Asp Ser Pro Leu Tyr Ser
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Glu Leu Thr Gln Ser Pro Ala Ser Val Ser Ala Ser Val Gly Asp Thr
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Asp Ile His Asn Trp Leu Ala
20 25 30

Trp Tyr Gln Gln Gln Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Ser Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly Arg Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Phe Ala Thr Tyr Tyr Cys Gln Gln Gly Asn Ser Phe Pro Lys Phe Gly
85 90 95

Pro Gly Thr Val Val Asp Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Leu Ser Asn Asn Tyr Leu
20 25 30

Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Ser Ser Thr Arg Gly Thr Gly Ile Pro Asp Arg Phe Ser Gly Gly
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Tyr Cys Gln His Tyr Gly Asn Ser Val Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 
(B) TYPE: amino acid
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Gln Ser Pro Asp Thr Leu Ser Leu Asn Pro Gly Glu Arg Ala Thr Leu
1 5 10 15

Ser Cys Arg Ala Ser His Arg Ile Ser Ser Lys Arg Leu Ala Trp Tyr
20 25 30

Gln His Lys Arg Gly Gln Ala Pro Arg Leu Leu Ile Tyr Val Cys Pro
35 40 45

Asn Arg Ala Gly Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly
50 55 60

Thr Asp Phe Thr Leu Thr Tyr Ser Arg Leu Glu Pro Glu Asp Phe Ala
65 70 75 80

Met Tyr Tyr Cys Gln Tyr Tyr Gly Gly Ser Ser Tyr Thr Phe Gly Gln
85 90 95

Gly Thr Lys Val Glu Ile Thr Arg
100 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Gln Ser Pro Ser His Leu Ser Leu Ser Pro Gly Glu Arg Ala Ile Leu
1 5 10 15

Ser Cys Arg Ala Ser Gln Arg Val Ser Ala Pro Tyr Leu Ala Trp Tyr
20 25 30

Gln Gln Arg Pro Gly Gln Ala Pro Arg Leu Val Ile Tyr Gly Ala Ser
35 40 45

Thr Arg Ala Thr Asp Ile Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly
50 55 60

Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu Asp Phe Ala
65 70 75 80

Ile Tyr Tyr Cys Gln Val Tyr Gly Gln Ser Pro Val Leu Phe Gly Gln
85 90 95

Gly Thr Lys Leu Glu Met Lys Arg 
100 

(2) INFORMATION FOR SEQ ID NO: 93:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Asp Arg Ala Thr Leu
1 5 10 15

Ser Cys Arg Ala Ser Gln Ser Leu Ser Ser Ser Phe Leu Ala Trp Tyr
20 25 30

Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr Ser Ala Ser
35 40 45

Met Arg Ala Thr Gly Ile Pro Asp Arg Phe Arg Gly Ser Val Ser Gly
50 55 60

Thr Asp Phe Thr Leu Thr Ile Thr Arg Leu Glu Pro Glu Asp Phe Ala
65 70 75 80

Val Tyr Tyr Cys Gln Arg Phe Gly Thr Ser Pro Leu Tyr Thr Phe Gly
85 90 95

Gln Gly Thr Lys Leu Glu Met Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg Ala Thr Leu
1 5 10 15

Ser Cys Arg Ala Ser Gln Ser Phe Ser Ser Asn Phe Leu Ala Trp Tyr
20 25 30

Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr Val His Pro
35 40 45

Asn Arg Ala Thr Gly Val Pro Asp Arg Phe Ser Gly Ser Gly Ser Gly
50 55 60

Thr Asp Phe Thr Leu Thr Ile Arg Arg Leu Glu Pro Glu Asp Phe Ala
65 70 75 80

Val Tyr Tyr Cys Gln Gln Tyr Gly Ala Ser Leu Val Ser Phe Gly Pro
85 90 95

Gly Thr Lys Val His Ile Lys Arg
100 

(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Phe Ser Cys Arg Ser Ser His Ser Ile Arg Ser Arg Arg Val
20 25 30

Ala Trp Tyr Gln His Lys Pro Gly Gln Ala Pro Arg Leu Val Ile His
35 40 45

Gly Val Ser Asn Arg Ala Ser Gly Ile Ser Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Thr Arg Val Glu Pro Glu
65 70 75 80

Asp Phe Ala Leu Tyr Tyr Cys Gln Val Tyr Gly Ala Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Arg Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Thr Pro Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Thr Ser His Ser Ile Arg Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln Val Lys Gly Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Val Ser Asn Arg Ala Gly Gly Ile Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Tyr Cys Gln Gln Tyr Gly Ser Ser Arg Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg Thr
100 105

(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Glu Leu Thr Gln Ala Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Phe Ser Cys Arg Ser Ser His Ser Ile Arg Ser Arg Arg Val
20 25 30

Arg Trp Tyr Gln His Lys Pro Gly Gln Ala Pro Arg Leu Val Ile His
35 40 45

Gly Val Ser Asn Arg Ala Ser Gly Ile Ser Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Thr Arg Val Glu Pro Glu
65 70 75 80

Asp Phe Ala Leu Tyr Tyr Cys Gln Val Tyr Gly Ala Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Arg Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Glu Leu Thr Gln Ala Pro Gly Thr Leu Ser Leu Ser Pro Gly Asp Arg
1 5 10 15

Ala Thr Phe Ser Cys Arg Ser Ser His Asn Ile Arg Ser Arg Arg Val
20 25 30

Ala Trp Tyr Gln His Lys Pro Gly Gln Ala Pro Arg Leu Val Ile His
35 40 45

Gly Val Ser Asn Arg Ala Ser Gly Ile Ser Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Thr Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Leu Tyr Tyr Cys Gln Val Tyr Gly Ala Ser Ser Tyr Thr
85 90 95 

Phe Gly Gln Gly Thr Lys Leu Asp Phe Lys Arg Thr
100 105

(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Gly Gln Ser Ile Ser Ser Asn Tyr Leu
20 25 30

Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Ala Ser Asn Arg Ala Thr Gly Ile Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Ser Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Tyr Cys Gln Gln Tyr Gly Thr Ser Pro Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Gln Leu Asp Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 

Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg Ala Thr Leu
1 5 10 15

Ser Cys Arg Ala Ser Gln Ser Leu Ser Asn Asn Tyr Leu Ala Trp Tyr
20 25 30

Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu Ile Tyr Gly Ser Ser
35 40 45

Thr Arg Ala Thr Gly Ile Pro Asp Arg Phe Ser Gly Gly Gly Ser Gly
50 55 60

Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu Asp Phe Ala
65 70 75 80

Val Tyr Tyr Cys Gln Gln Tyr Gly Asn Ser Val Tyr Thr Phe Gly Gln
85 90 95

Gly Thr Lys Leu Glu Ile Lys Arg
100 

(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 106 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Thr Ser Gln Gly Ile Ser Asn Tyr Leu Ala
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Val Pro Lys Leu Leu Ile Tyr Gly
35 40 45

Ala Ser Thr Leu Gln Ser Gly Gly Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Asn Ser Leu Gln Pro Glu Asp
65 70 75 80

Val Ala Thr Tyr Ser Cys Gln Asn Tyr Asp Ser Ala Pro Trp Thr Phe
85 90 95

Gly Gln Gly Thr Lys Val Asp Ile Lys Arg
100 105 



(2) INFORMATION FOR SEQ ID NO: 102:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 108 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Ser Ile Ser Asn Tyr Leu Asn
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Ser Leu Gln Arg Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Ser Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Phe Ala Thr Tyr Tyr Cys Gln Gln [illegible] Tyr Ser Ile Pro Pro Leu Thr
85 90 95

Phe Gly Gly Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105

(2) INFORMATION FOR SEQ ID NO: 103:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Asn Ile Asn Asn Tyr Leu Asn
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Glu Ala Pro Lys Leu Leu Ile His Thr
35 40 45

Ala Phe Asn Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly Thr Ala
50 55 60

Ser Gly Thr Glu Phe Thr Leu Thr Ile Arg Ser Leu Gln Pro Glu Asp
65 70 75 80

Phe Ala Thr Tyr Tyr Cys Gln Gln Ser Tyr Ser Thr Pro Tyr Thr Phe
85 90 95

Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Ser Ile Ser Ser Tyr Leu Asn
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Ser Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Phe Ala Thr Tyr Tyr Cys Gln Gln Ser Tyr Ser Thr Pro Tyr Thr Phe
85 90 95

Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Arg Ala Ser Gln Ser Ile Ser Ser Tyr Leu Asn
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Ala
35 40 45

Ala Ser Ser Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Phe Ala Thr Tyr Tyr Cys Gln Gln Ser Tyr Ser Thr Pro Gln Thr Phe
85 90 95

Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg Val Thr Ile
1 5 10 15

Thr Cys Arg Ala Ser Gln Thr Ile Ser Ser Tyr Leu Asn Trp Tyr Gln
20 25 30

Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Ala Ala Ser Ser
35 40 45

Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly Gly Gly Ser Gly Thr
50 55 60

Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp Phe Ala Thr
65 70 75 80

Tyr Tyr Cys Gln Gln Ser Tyr Ser Thr Pro Tyr Thr Phe Gly Gln Gly
85 90 95

Thr Lys Leu Glu Ile Lys Arg Thr
100 

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Gln Ala Ser Gln Asp Ile Arg Asn Tyr Leu Asn
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Asp
35 40 45

Ala Ser Asn Ser Glu Thr Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Arg Asp Phe Thr Phe Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Val Ala Thr Tyr Tyr Cys Gln Gln His Gln Asn Val Pro Leu Thr Phe
85 90 95

Gly Gly Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Val Thr Ile Thr Cys Gln Ala Ser Gln Asp Ile Ser Asn His Leu Asn
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Asp
35 40 45

Ala Ser Asn Leu Glu Thr Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Phe Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Ile Ala Thr Tyr Tyr Cys Gln Gln Tyr Asp Asn Leu Pro Leu Thr Phe
85 90 95

Gly Gly Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly Asp Arg
1 5 10 15

Ile Thr Ile Thr Cys Arg Ala Ser Gln Thr Ile Asn Asn Tyr Leu Asn
20 25 30

Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Tyr Gly
35 40 45

Ala Ser Asn Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly Ser Gly
50 55 60

Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro Glu Asp
65 70 75 80

Phe Ala Thr Tyr Phe Cys Gln Gln Ser Tyr Asn Thr Pro Pro Trp Thr
85 90 95

Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 108 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser Gln Arg Val Asn Ser Asn Tyr Leu
20 25 30

Ala Trp Tyr Gln Gln Lys Pro Gly Gln Thr Pro Arg Val Val Ile Tyr
35 40 45

Ser Thr Ser Arg Arg Ala Thr Gly Val Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Tyr Cys Gln Gln Phe Gly Asp Ala Gln Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg Thr
100 105 

(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 93 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Arg Val Asn Ser Asn
1 5 10 15

Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Thr Pro Arg Val Val
20 25 30

Ile Tyr Ser Thr Ser Arg Arg Ala Thr Gly Val Pro Asp Arg Phe Ser
35 40 45

Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu
50 55 60

Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Phe Gly Asp Ala Gln
65 70 75 80

Tyr Thr Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg
85 90 

(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Thr Gln Ser Pro Ser Ser Val Ser Ala Ser Val Gly Asp Thr Val Thr
1 5 10 15

Phe Thr Cys Arg Ala Ser Gln Asp Ile Arg Asn Tyr Leu Asn Trp Tyr
20 25 30

His Gln Lys Pro Gly Lys Ala Pro Lys Leu Leu Ile Ser Asp Ala Ser
35 40 45

Asp Leu Glu Ile Gly Val Pro Ser Arg Phe Ser Gly Ser Gly Ser Ala
50 55 60

Thr Tyr Phe Ser Phe Thr Ile Ser Ser Leu Gln Pro Glu Asp Ile Gly
65 70 75 80

Thr Tyr Tyr Cys Gln Gln Tyr Ala Asp Leu Ile Thr Phe Gly Gly Gly
85 90 95

Thr Lys Val Glu Ile Lys Arg Thr
100

(2) INFORMATION FOR SEQ ID NO: 113:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

Ser Pro Gly Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val
1 5 10 15

Gly Thr Asn Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg
20 25 30

Leu Leu Ile Phe Asp Ala Ser Thr Arg Asp Thr Tyr Ile Pro Asp Thr
35 40 45

Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Ala Leu Thr Ile Ser Ser
50 55 60

Leu Gln Ser Glu Asp Phe Gly Phe Tyr Tyr Cys Gln Gln Tyr Asp Asn
65 70 75 80

Trp Pro Pro Thr Phe Gly Gln Gly Thr Lys Leu Glu Val Lys Arg Thr
85 90 95 



(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Asp Arg
1 5 10 15

Ala Thr Phe Ser Cys Arg Ser Ser His Asn Ile Arg Ser Arg Arg Val
20 25 30

Ala Trp Tyr Gln His Lys Pro Gly Gln Ala Pro Arg Leu Val Ile His
35 40 45

Gly Val Ser Asn Arg Ala Ser Gly Ile Ser Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Thr Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Leu Tyr Tyr Cys Gln Val Tyr Gly Ala Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Asp Phe Lys Arg
100 105



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly Glu Arg
1 5 10 15

Ala Thr Phe Ser Cys Arg Ser Ser His Asn Ile Arg Ser Arg Arg Val
20 25 30

Ala Trp Tyr Gln His Lys Pro Gly Gln Ala Pro Arg Leu Val Ile His
35 40 45

Gly Val Ser Asn Arg Ala Thr Gly Ile Ser Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Thr Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Leu Tyr Tyr Cys Gln Val Tyr Gly Ala Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Asp Phe Lys Arg
100 105



(2) INFORMATION FOR SEQ ID NO: 116: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

Glu Leu Thr Gln Ser Pro Asp Thr Leu Ser Leu Asn Val Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser His Arg Ile Ser Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln His Lys Arg Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Val Ser Ser Arg Ala Gly Gly Val Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Ser Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Met Tyr Tyr Cys Gln Thr Tyr Gly Gly Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Val Asp Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 117:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Glu Leu Thr Gln Ser Pro Asp Thr Leu Ser Leu Asn Ala Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser His Arg Ile Ser Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln His Lys Arg Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Val Ser Asn Arg Ala Gly Gly Val Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Ser Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Ile Tyr Tyr Cys Gln Thr Tyr Gly Gly Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Thr Val Asp Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

Glu Leu Thr Gln Ser Pro Asp Thr Leu Ser Leu Asn Thr Gly Glu Arg
1 5 10 15

Ala Thr Leu Ser Cys Arg Ala Ser His Arg Ile Gly Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln His Arg Arg Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Val Ser Asn Arg Ala Gly Gly Val Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Ile Tyr Tyr Cys Gln Thr Tyr Gly Gly Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Val Asp Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Thr Pro Gly Glu Arg
1 5 10 15

Ala Ile Leu Ser Cys Lys Thr Ser His Asn Ile Trp Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln Leu Lys Ser Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Val Ser Lys Arg Ala Gly Gly Ile Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Ala Thr Asp Phe Thr Leu Thr Ile Ser Arg Val Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Tyr Cys Gln Thr Tyr Gly Gly Ser Ala Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Asp Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Thr Pro Gly Glu Arg
1 5 10 15

Ala Ile Leu Ser Cys Lys Thr Ser His Asn Ile Trp Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln Leu Lys Ser Gly Gln Ala Pro Arg Leu Leu Ile Tyr
35 40 45

Gly Val Ser Lys Arg Ala Gly Gly Ile Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Ala Thr Asp Phe Thr Leu Thr Ile Ser Arg Val Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Tyr Cys Gln Thr Tyr Gly Gly Ser Ala Tyr Thr
85 90 95

Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Ser Thr Pro Gly Glu Arg
1 5 10 15

Ala Ile Leu Ser Cys Lys Thr Ser His Asn Ile Trp Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln Val Lys Ser Gly Leu Pro Pro Arg Leu Leu Ile His
35 40 45

Gly Val Ser Arg Arg Ala Gly Gly Ile Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Ala Arg Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Ala
65 70 75 80

Asp Phe Ala Val Tyr Tyr Cys Gln Thr Tyr Gly Gly Ser Ser Tyr Ser
85 90 95

Phe Gly Gln Gly Thr Lys Leu Asp Phe Asn Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Asn Pro Gly Glu Arg
1 5 10 15

Ala Val Leu Ser Cys Arg Thr Ser Arg Asn Ile Trp Ser Arg Arg Leu
20 25 30

Ala Trp Tyr Gln Val Arg Arg Gly Gln Ala Pro Arg Leu Leu Ile His
35 40 45

Gly Val Ser Lys Arg Ala Gly Gly Val Pro Asp Arg Phe Ser Gly Ser
50 55 60

Gly Ser Ala Arg Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu Pro Glu
65 70 75 80

Asp Phe Ala Val Tyr Phe Cys Gln Thr Tyr Gly Gly Ser Ser Tyr Thr
85 90 95

Phe Gly Gln Gly Asn Lys Leu Asp Ile Arg Arg
100 105 

(2) INFORMATION FOR SEQ ID NO: 123: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Gly His Arg Pro Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Ala Asn Gly Val Thr Glu Ile Pro Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Leu Thr Arg Asp Thr Ser Ala Gly Thr Val
65 70 75 80

Tyr Leu Glu Leu Thr Asn Leu Arg Phe Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Thr Val
115 120 125

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Gly His Arg Pro Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Ala Asn Gly Val Thr Glu Ile Ser Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Leu Thr Gly Asp Thr Ser Ala Ser Thr Val
65 70 75 80

Tyr Leu Glu Leu Arg Asn Leu Arg Phe Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Arg Gly Thr Thr Val Thr
115 120 125

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Gly His Arg Pro Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Ala Asn Gly Val Thr Glu Ile Ser Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Leu Thr Gly Asp Thr Ser Ala Ser Thr Val
65 70 75 80

Tyr Leu Glu Leu Arg Ser Leu Arg Phe Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val
115 120

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Ile Ser Cys Gln Ala Ser Gly Tyr Arg Phe Thr Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Gly Gln Arg Pro Glu Trp
35 40 45

Met Gly Trp Phe Asn Pro Ala Asn Gly Ile Lys Glu Ile Ser Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Phe Thr Gly Asp Thr Ser Ala Ser Thr Ala
65 70 75 80

Tyr Val Glu Leu Arg Asn Leu Arg Ser Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Pro Trp Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val
115 120

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Gly His Arg Pro Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Ala Asn Gly Val Thr Glu Ile Ser Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Leu Thr Gly Asp Thr Ser Ala Ser Thr Val
65 70 75 80

Tyr Leu Glu Leu Arg Asn Leu Arg Phe Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Asp Phe Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val
115 120

(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Leu Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Gly His Arg Pro Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Ala Asn Gly Val Thr Glu Ile Ser Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Leu Thr Gly Asp Thr Ser Ala Ser Thr Val
65 70 75 80

Tyr Leu Glu Leu Arg Asn Leu Arg Phe Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Thr
115 120 125

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 



Gln Val Lys Leu Leu Glu Gln Ser Gly Thr Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Arg Phe Thr Asn
20 25 30

Phe Pro Leu His Trp Val Arg Gln Ala Pro Gly Gln Arg Pro Glu Trp
35 40 45

Met Gly Trp Ile Lys Ile Val Asn Gly Glu Lys Lys Tyr Ser Gln Lys
50 55 60

Phe Val Asp Arg Val Thr Phe Thr Gly Asp Thr Ser Ala Asn Thr Ala
65 70 75 80

Tyr Met Glu Val Arg Gly Leu Arg Ser Ala Asp Thr Ala Thr Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Met Asp Pro Gln Ala Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Thr
115 120 125

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Ile His Trp Val Arg Gln Ala Pro Gly Gln Arg Phe Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Tyr Asn Gly Asn Lys Glu Phe Ser Ala Lys
50 55 60

Phe Arg Asp Arg Val Thr Phe Thr Ala Asp Thr Asp Ala Asn Thr Ala
65 70 75 80

Tyr Met Glu Leu Arg Ser Leu Arg Ser Ala Asp Thr Ala Ile Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Pro Tyr Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val
115 120

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Thr Gln Asp Leu Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Ala Asn Gly Val Lys Glu Ile Ser Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Leu Thr Gly Asp Thr Ser Ala Ser Thr Val
65 70 75 80

Tyr Leu Glu Leu Arg Ser Leu Arg Phe Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val
115 120

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132:

Gln Val Lys Leu Leu Glu Gln Ser Gly Ala Glu Val Lys Lys Pro Gly
1 5 10 15

Ala Ser Val Lys Val Ser Cys Gln Ala Ser Gly Tyr Arg Phe Ser Asn
20 25 30

Phe Val Leu His Trp Ala Arg Gln Ala Pro Gly His Arg Pro Glu Trp
35 40 45

Met Gly Trp Ile Asn Pro Ala Asn Gly Val Thr Glu Ile Pro Pro Lys
50 55 60

Phe Gln Asp Arg Val Ser Leu Thr Arg Asp Thr Ser Ala Gly Thr Val
65 70 75 80

Tyr Leu Glu Leu Thr Asn Leu Arg Phe Ala Asp Thr Ala Val Tyr Tyr
85 90 95

Cys Ala Arg Val Gly Glu Trp Thr Trp Asp Asp Ser Pro Gln Asp Asn
100 105 110

Tyr Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val
115 120

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 
TCGAGGGTCG GTCGGTCTCT AGACGGTCGG TCGGTCA 
(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 
CTAGTGACCG ACCGACCGTC TAGAGACCGA CCGACCC 
(2) INFORMATION FOR SEQ ID NO: 135:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135:
CGGTCGGTCG GTCCTCGAGG GTCGGTCGGT CT 
(2) INFORMATION FOR SEQ ID NO: 136: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136:
CTAGAGACCG ACCGACCCTC GAGGACCGAC CGACCGAGCT
(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 
CAAGGAGACA GGATCCATGA AATAC 
(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138: 
AGGGCGAATT GGATCCCGGG CCCCC 
(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 
CTAGTCATCA TCATCATCAT TAAGCTAGC 29 
(2) INFORMATION FOR SEQ ID NO: 140: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140: 
CTAGGCTAGC TTAATGATGA TGATGATGA 29 
(2) INFORMATION FOR SEQ ID NO: 141: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 



(ix) FEATURE: 

(A) NAME/KEY: Modified-site
(B) LOCATION: 1
(D) OTHER INFORMATION: /label= J

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 13

(D) OTHER INFORMATION: /label= ZC

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:

Ser Ile Ser Ile Gly Pro Gly Arg Ala Phe Tyr Thr Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 

Leu Leu Glu Ser Gly Pro Gly Leu Val Lys Pro Ser Glu Thr Leu Ser
1 5 10 15

Leu Thr Cys Thr Val Ser Gly Gly Ser Leu Ser Ser Phe Asp Trp Asn
20 25 30

Trp Ile Arg Gln Pro Ala Gly Lys Gly Leu Glu Trp Ile Gly Arg Ile
35 40 45

Tyr Pro Ser Gly Asn Thr His Tyr Asn Pro Ser Leu Arg Ser Arg Val
50 55 60

Thr Met Ser Arg Asp Thr Ser Lys Asn Gln Phe Ser Val Lys Leu Thr
65 70 75 80

Ser Val Thr Ala Ala Asp Thr Ala Leu Tyr Tyr Cys Ala Arg Glu Asn
85 90 95

Thr Gly Arg Thr Ile Glu Glu Ile Gly Asn Phe Phe Asp Ile Trp Gly
100 105 110

Gln Gly Thr Leu Val Thr Val Ser Ser Ala Ser Thr Lys Gly
115 120 125

(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 122 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

Leu Leu Lys Ser Gly Gly Gly Leu Val Lys Pro Gly Gly Ser Leu Arg
1 5 10 15

Leu Ser Cys Val Ile Ser Ala Phe Ser Phe Ser Gly Tyr Asn Ile Asn
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ser Ser Ile
35 40 45

Ser Met Ser Thr Gly Ser Leu Ser Tyr Ala Asp Ser Met Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Asp Asn Ala Lys Asn Ser Val Tyr Leu Glu Met
65 70 75 80

Ser Ser Leu Thr Ala Glu Asp Thr Ala Met Tyr Tyr Cys Ala Ala Arg
85 90 95

Thr Pro Leu Val Gly Arg Ala Leu Asp Ile Trp Gly Gln Gly Thr Val
100 105 110

Val Thr Val Ser Ser Ala Ser Thr Lys Gly
115 120

(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 132 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Leu Leu Glu Ser Gly Gly Gly Leu Val Lys Pro Gly Gly Ser Leu Arg
1 5 10 15

Leu Ser Cys Ser Ala Ser Gly Phe Thr Phe Ser Ser Tyr Gly Met Asn
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Pro Glu Trp Val Ala Tyr Ile
35 40 45

Ser Ser Ser Arg Lys Tyr Thr Glu Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Thr Ile Ser Arg Glu Asn Ala Lys Tyr Ser Val Phe Leu Gln Leu
65 70 75 80

Asp Ser Leu Thr Ala Glu Asp Thr Ala Ile Tyr Tyr Cys Ala Arg Gly
85 90 95

Arg Asp Phe Tyr Ser Gly Phe Gly Arg Arg Asp Asp Phe His Leu His
100 105 110

Tyr Met Asp Val Trp Gly Lys Gly Thr Thr Val Thr Val Ser Ser Ala
115 120 125

Ser Thr Lys Gly
130

(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Leu Leu Glu Gln Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu
1 5 10 15

Arg Ile Ser Cys Val Ala Ser Gly Asp Ile Phe Tyr Ser Tyr Ala Met
20 25 30

Ser Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Glu Trp Val Ala Ser
35 40 45

Ile Ser Gly Thr Gly Gly Ser Asn Tyr Tyr Ala Asp Ser Val Lys Gly
50 55 60

Arg Phe Thr Ile Ser Arg Asp Asn Ser Lys Ser Thr Leu Tyr Leu Gln
65 70 75 80

Met Asn Ser Leu Arg Ala Glu Asp Thr Ala Leu Tyr Tyr Cys Ala Arg
85 90 95

Asp Arg Gly Pro Arg Ile Gly Ile Arg Gly Trp Phe Asp Ser Trp Gly
100 105 110

Gln Gly Thr Leu Val Thr Val Ser Ser Ala Ser Thr Lys Gly
115 120 125

(2) INFORMATION FOR SEQ ID NO: 146:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 124 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146: 

Leu Leu Glu Ser Gly Gly Gly Leu Val Gln Pro Gly Gly Ser Leu Arg
1 5 10 15

Leu Ser Cys Ala Ala Ser Gly Phe Leu Tyr Ser Ser Phe Ala Met Ser
20 25 30

Trp Val Arg Gln Ala Pro Gly Lys Gly Leu Ala Trp Val Ser Thr Ile
35 40 45

Ser Ala Ser Gly Gly Ser Thr Lys Tyr Ala Asp Ser Val Lys Gly Arg
50 55 60

Phe Ile Ile Ser Arg Asp Asn Ser Lys Asn Thr Ile Tyr Leu Gln Met
65 70 75 80

Asp Ser Leu Arg Ala Glu Asp Thr Ala Val Tyr Tyr Cys Ala Lys Asn
85 90 95

Phe Arg Ala Phe Ala Arg Asp Pro Trp Gly Asp Trp Gly Gln Gly Thr
100 105 110

Leu Val Thr Val Ser Ser Ala Ser Ala Ser Thr Lys
115 120

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 109 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

Met Ala Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly
1 5 10 15

Glu Arg Val Ile Val Ser Cys Arg Ala Ser Gln Ser Val Ser Ser Asn
20 25 30

Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu
35 40 45

Ile Tyr Gly Ala Ser Asn Arg Ala Thr Gly Ile Pro Asp Arg Phe Ser
50 55 60

Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu
65 70 75 80

Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Tyr Gly Ser Ser Gly
85 90 95

Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr
100 105

(2) INFORMATION FOR SEQ ID NO: 148:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 112 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Met Ala Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly
1 5 10 15

Glu Arg Ala Thr Phe Ser Cys Arg Ser Ser His Ser Ile His Thr Arg
20 25 30

Arg Val Ala Trp Tyr Gln His Lys Pro Gly Gln Ala Pro Arg Leu Val
35 40 45

Ile His Gly Val Ser Asn Arg Ala Ser Gly Ile Ser Asp Arg Phe Ser
50 55 60

Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Thr Arg Val Glu
65 70 75 80

Pro Glu Asp Phe Ala Leu Tyr Tyr Cys Gln Val Tyr Gly Ala Ser Ser
85 90 95

Tyr Thr Phe Gly Gln Gly Thr Lys Leu Glu Arg Lys Arg Thr Val Val
100 105 110

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 111 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149: 

Met Ala Glu Leu Thr Gln Ser Pro Gly Thr Leu Ser Leu Ser Pro Gly
1 5 10 15

Glu Arg Ala Thr Leu Ser Cys Arg Ala Ser Gln Ser Val Ser Asn Gly
20 25 30

Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Leu
35 40 45

Ile Tyr Gly Ala Ser Thr Arg Ala Thr Asp Ile Pro Asp Arg Phe Ser
50 55 60

Gly Ser Gly Ser Gly Ala Asp Phe Thr Leu Ala Ile Ser Arg Leu Glu
65 70 75 80

Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Tyr Ala Gly Ser His
85 90 95

Thr Phe Gly Gln Gly Thr Lys Leu Glu Ile Lys Arg Thr Val Ala
100 105 110

(2) INFORMATION FOR SEQ ID NO: 150: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 111 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150: 

Met Ala Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly
1 5 10 15

Asp Arg Val Thr Ile Thr Cys Arg Pro Ser Gln Gly Ile Gly Arg Phe
20 25 30

Phe Asn Trp Tyr Gln Gln Lys Pro Gly Lys Ala Pro Asn Leu Leu Ile
35 40 45

Tyr Ala Ala Asp Ile Leu Gln Ser Gly Val Pro Ser Arg Phe Ser Gly
50 55 60

Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Leu Gln Pro
65 70 75 80

Glu Asp Phe Ala Thr Tyr Tyr Cys Gln Gln Ser Tyr Ser Thr Pro Tyr
85 90 95

Thr Phe Gly Gln Gly Thr Arg Leu Asp Ile Lys Arg Thr Val Ala
100 105 110

(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 112 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

Met Ala Glu Leu Thr Gln Ser Pro Ser Ser Leu Ser Ala Ser Val Gly
1 5 10 15

Asp Arg Val Thr Ile Thr Cys Arg Ala Ser Gln Gly Val Ser Ser Ser
20 25 30

Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ala Pro Arg Leu Val
35 40 45

Ile Phe Gly Ala Tyr Ser Arg Ala Thr Gly Ile Pro Asp Arg Phe Ser
50 55 60

Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Arg Leu Glu
65 70 75 80

Pro Glu Asp Phe Ala Val Tyr Tyr Cys Gln Gln Tyr Gly Ser Ser Pro
85 90 95

Ile Thr Phe Gly Pro Gly Thr Lys Val Asp Ile Lys Arg Thr Val Ala
100 105 110

(2) INFORMATION FOR SEQ ID NO: 152:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 729 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 9..715



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152: 

AGCTTACC ATG GGT GTG CCC ACT CAG GTC CTG GGG TTG CTG CTG CTG TGG 50
Met Gly Val Pro Thr Gln Val Leu Gly Leu Leu Leu Leu Trp
1 5 10

CTT ACA GAT GCC AGA TGT GAG ATC GTT CTC ACG CAG TCT CCA GGC ACC 98
Leu Thr Asp Ala Arg Cys Glu Ile Val Leu Thr Gln Ser Pro Gly Thr
15 20 25 30

CTG TCT CTG TCT CCA GGG GAA AGA GCC ACC TTC TCC TGT AGG TCC AGT 146
Leu Ser Leu Ser Pro Gly Glu Arg Ala Thr Phe Ser Cys Arg Ser Ser
35 40 45

CAC AGC ATT CGC AGC GGC CGC GTA GCC TGG TAC CAG CAC AAA CCT GGC 194
His Ser Ile Arg Ser Arg Arg Val Ala Trp Tyr Gln His Lys Pro Gly
50 55 60

CAG GCT CCA AGG CTG GTC ATA CAT GGT GTT TCC AAT AGG GCC TCT GGC 242
Gln Ala Pro Arg Leu Val Ile His Gly Val Ser Asn Arg Ala Ser Gly
65 70 75

ATC TCA GAC AGG TTC AGC GGC AGT GGG TCT GGG ACA GAC TTC ACT CTC 290
Ile Ser Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu
80 85 90

ACC ATC ACC AGA GTG GAG CCT GAA GAC TTT GCA CTG TAC TAC TGT CAG 338
Thr Ile Thr Arg Val Glu Pro Glu Asp Phe Ala Leu Tyr Tyr Cys Gln
95 100 105 110

GTC TAT GGT GCC TCC TCG TAC ACT TTT GGC CAG GGG ACC AAA CTG GAG 386
Val Tyr Gly Ala Ser Ser Tyr Thr Phe Gly Gln Gly Thr Lys Leu Glu
115 120 125

AGG AAA CGA ACT GTG CCT GCA CCA TCT GTC TTC ATC TTC CCG CCA TCT 434
Arg Lys Arg Thr Val Pro Ala Pro Ser Val Phe Ile Phe Pro Pro Ser
130 135 140

GAT GAG CAG TTG AAA TCT GGG ACT GCC TCT GTT GTG TGC CTG CTG AAT 482
Asp Glu Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn
145 150 155

AAC TTC TAT CCC AGA GAG GCC AAA GTA CAG TGG AAG GTG GAT AAC GCC 530
Asn Phe Tyr Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala
160 165 170

CTC CAA TCG GGT AAC TCC CAG GAG AGT GTC ACA GAG CAG GAC AGC AAG 578
Leu Gln Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys
175 180 185 190

GAC AGC ACC TAC AGC CTC AGC AGC ACC CTG ACG CTG AGC AAA GCA GAC 626
Asp Ser Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp
195 200 205

TAC GAG AAA CAC AAA GTC TAC GCC TGC GAA GTC ACC CAT CAG GGC CTG 674
Tyr Glu Lys His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu
210 215 220

AGT TCG CCC GTC ACA AAG AGC TTC AAC AGG GGA GAG TGT TA ATTCTAGAGA 725
Ser Ser Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys
225 230 235 

ATTC 729 



(2) INFORMATION FOR SEQ ID NO: 153: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 235 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153: 

Met Gly Val Pro Thr Gln Val Leu Gly Leu Leu Leu Leu Trp Leu Thr
1 5 10 15

Asp Ala Arg Cys Glu Ile Val Leu Thr Gln Ser Pro Gly Thr Leu Ser
20 25 30

Leu Ser Pro Gly Glu Arg Ala Thr Phe Ser Cys Arg Ser Ser His Ser
35 40 45

Ile Arg Ser Arg Arg Val Ala Trp Tyr Gln His Lys Pro Gly Gln Ala
50 55 60

Pro Arg Leu Val Ile His Gly Val Ser Asn Arg Ala Ser Gly Ile Ser
65 70 75 80

Asp Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile
85 90 95

Thr Arg Val Glu Pro Glu Asp Phe Ala Leu Tyr Tyr Cys Gln Val Tyr
100 105 110

Gly Ala Ser Ser Tyr Thr Phe Gly Gln Gly Thr Lys Leu Glu Arg Lys
115 120 125

Arg Thr Val Pro Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu
130 135 140

Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe
145 150 155 160

Tyr Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln
165 170 175

Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser
180 185 190

Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu
195 200 205

Lys His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser
210 215 220

Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys
225 230 235

(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 15..452



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 



AATTCGCCGC CACC ATG GAA TGG AGC TGG GTC TTT CTC TTC TTC CTG TCA 
Met Glu Trp Ser Trp Val Phe Leu Phe Phe Leu Ser 
1 5 10

SHINE-DALGARNO MET

GGCCGCAAATTCTATTTCAAGGAGACAGTCATAATG
CGTTTAAGATAAAGTTCCTCTGTCAGTATTAC

LEADER SEQUENCE

AAATACCTATTGCCTACGGCAGCCGCT
TTTATGGATAACGGATGCCGTCGGCGA

LEADER SEQUENCE

GGATTGTTATTACTCGCTGCCCAACCAG
CCTAACAATAATGAGCGACGGGTTGGTC

LINKER LINKER NCOI BACKBONE XHOI SPEI

CCATGGCCCAGGTGAAACTGCTCGAGATTTCTAGACTAGT
GGTACCGGGTCCACTTTGACGAGCTCTAAAGATCTGATCA

STOP LINKER

Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser

TACCCGTACGACGTTCCGGACTACGGTTCTTAATAGAATTCG
ATGGGCATGCTGCAAGGCCTGATGCCAAGAATTATCTTAAGCAGCT



FIG.

AGCTTACCAT GGGTGTGCCC ACTCAGGTCC TGGGGTTGCT GCTGCTGTGG CTTACAGATG
TCGAATGGTA CCCACACGGG TGAGTCCAGG ACCCCAACGA CGACGACACC GAATGTCTAC

******

AGCACAAACC TGGCCAGGCT CCAAGGCTGG TCATACATGG TGTTTCCAAT AGGGCCTCTG 
TCGTGTTTGG ACCGGTCCGA GGTTCCGACC AGTATGTACC ACAAAGGTTA TCCCGGAGAC 
QHKP GQA PRL VIHG VSN RAS> 

245 250 255 260 265 270 275 280 285 290 295 300 
****** 

GCATCTCAGA CAGGTTCAGC GGCAGTGGGT CTGGGACAGA CTTCACTCTC ACCATCACCA 
CGTAGAGTCT GTCCAAGTCG CCGTCACCCA GACCCTGTCT GAAGTGAGAG TGGTAGTGGT 
GISD RFS GSG SGTD FTL TIT> 

305 310 315 320 325 330 335 340 345 350 355 360 
****** 

GAGTGGAGCC TGAAGACTTT GCACTGTACT ACTGTCAGGT CTATGGTGCC TCCTCGTACA 
CTCACCTCGG ACTTCTGAAA CGTGACATGA TGACAGTCCA GATACCACGG AGGAGCATGT 
RVEP EDF ALY YCQV YGA SSY> 

365 370 375 380 385 390 395 400 405 410 415 420 
****** 

CTTTTGGCCA GGGGACCAAA CTGGAGAGGA AACGAACTGT GCCTGCACCA TCTGTCTTCA 
GAAAACCGGT CCCCTGGTTT GACCTCTCCT TTGCTTGACA CGGACGTGGT AGACAGAAGT 
TFGQ GTK LER KRTV PAP SVF> 

425 430 435 440 445 450 455 460 465 470 475 480 
****** 

TCTTCCCGCC ATCTGATGAG CAGTTGAAAT CTGGGACTGC CTCTGTTGTG TGCCTGCTGA 
AGAAGGGCGG TAGACTACTC GTCAACTTTA GACCCTGACG GAGACAACAC ACGGACGACT 
IFPP SDE QLK SGTA SVV CLL> 

485 490 495 500 505 510 515 520 525 530 535 540 
****** 

ATAACTTCTA TCCCAGAGAG GCCAAAGTAC AGTGGAAGGT GGATAACGCC CTCCAATCGG 
TATTGAAGAT AGGGTCTCTC CGGTTTCATG TCACCTTCCA CCTATTGCGG GAGGTTAGCC 
NNFY PRE A K V QWKV DNA LQS> 

545 550 555 560 565 570 575 580 585 590 595 600 
****** 

GTAACTCCCA GGAGAGTGTC ACAGAGCAGG ACAGCAAGGA CAGCACCTAC AGCCTCAGCA 
CATTGAGGGT CCTCTCACAG TGTCTCGTCC TGTCGTTCCT GTCGTGGATG TCGGAGTCGT 
GNSQ ESV TEQ DSKD STY SLS>

605 610 615 620 625 630 635 640 645 650 655 660
******

GCACCCTGAC GCTGAGCAAA GCAGACTACG AGAAACACAA AGTCTACGCC TGCGAAGTCA
CGTGGGACTG CGACTCGTTT CGTCTGATGC TCTTTGTGTT TCAGATGCGG ACGCTTCAGT
STLT LSK ADY EKHK VYA CEV>

665 670 675 680 685 690 695 700 705 710 715 720 
****** 

CCCATCAGGG CCTGAGTTCG CCCGTCACAA AGAGCTTCAA CAGGGGAGAG TGTTAATTCT 
GGGTAGTCCC GGACTCAAGC GGGCAGTGTT TCTCGAAGTT GTCCCCTCTC ACAATTAAGA 
THQG LSS PVT KSFN RGE C*> 

725 
AGAGAATTC 
TCTCTTAAG

5 10 15 20 25 30 35 40 45 50 55 60
****** 

AATTCGCCGC CACCATGGAA TGGAGCTGGG TCTTTCTCTT CTTCCTGTCA GTAACTACAG 
TTAAGCGGCG GTGGTACCTT ACCTCGACCC AGAAAGAGAA GAAGGACAGT CATTGATGTC 
ME WSW VFLF FLS VTT> 

65 70 75 80 85 90 95 100 105 110 115 120 
****** 

GTGTCCACTC CCAGGTTCAG CTGGTTCAGT CCGGGGCTGA GGTGAAGAAG CCTGGGGCCT 
CACAGGTGAG GGTCCAAGTC GACCAAGTCA GGCCCCGACT CCACTTCTTC GGACCCCGGA 
GVHS QVQ LVQ SGAE VKK PGA>

125 130 135 140 145 150 155 160 165 170 175 180 
****** 

CAGTGAAGGT TTCTTGTCAG GCTTCTGGAT ACAGATTCAG TAACTTTGTT ATTCATTGGG 
GTCACTTCCA AAGAACAGTC CGAAGACCTA TGTCTAAGTC ATTGAAACAA TAAGTAACCC 
SVKV SCQ ASG YRFS NFV IHW> 

185 190 195 200 205 210 215 220 225 230 235 240 

* * * * * * 

TGCGCCAGGC CCCCGGACAG AGGTTTGAGT GGATGGGATG GATCAATCCT TACAACGGAA 
ACGCGGTCCG GGGGCCTGTC TCCAAACTCA CCTACCCTAC CTAGTTAGGA ATGTTGCCTT 
VRQA PGQ RFE WMGW INP YNG> 

245 250 255 260 265 270 275 280 285 290 295 300 

* * * * * * 

ACAAAGAATT TTCAGCGAAG TTCCAGGACA GAGTCACCTT TACCGCGGAC ACATCCGCGA 
TGTTTCTTAA AAGTCGCTTC AAGGTCCTGT CTCAGTGGAA ATGGCGCCTG TGTAGGCGCT 
NKEF SAK FQD RVTF TAD TSA> 

305 310 315 320 325 330 335 340 345 350 355 360 
****** 

ACACAGCCTA CATGGAGTTG AGGAGCCTCA GGTCTGCAGA CACGGCTGTT TATTATTGTG 
TGTGTCGGAT GTACCTCAAC TCCTCGGAGT CCAGACGTCT GTGCCGACAA ATAATAACAC 
NTAY MEL RSL RSAD TAV YYC>

365 370 375 380 385 390 395 400 405 410 415 420 
****** 

CGAGAGTGGG GCCATATAGT TGGGATGATT CTCCCCAGGA CAATTATTAT ATGGACGTCT 
GCTCTCACCC CGGTATATCA ACCCTACTAA GAGGGGTCCT GTTAATAATA TACCTGCAGA 
ARVG PYS WDD SPQD NYY MDV> 

425 430 435 440 445 450 455 460 465 470 475 480 
****** 

GGGGCAAAGG AACCACGGTC ATCGTGAGCT CAGCTTCCAC CAAGGGCCCA TCGGTCTTCC 
CCCCGTTTCC TTGGTGCCAG TAGCACTCGA GTCGAAGGTG GTTCCCGGGT AGCCAGAAGG 
V G K G TTV IVS S> 

485 490 495 500 505 510 515 520 525 530 535 540 
****** 

CCCTGGCACC CTCCTCCAAG AGCACCTCTG GGGGCACAGC GGCCCTGGGC TGCCTGGTCA 
GGGACCGTGG GAGGAGGTTC TCGTGGAGAC CCCCGTGTCG CCGGGACCCG ACGGACCAGT 

545 550 555 560 565 570 575 580 585 590 595 600 
****** 

AGGACTACTT CCCCGAACCG GTGACGGTGT CGTGGAACTC AGGCGCCCTG ACCAGCGGCG 
TCCTGATGAA GGGGCTTGGC CACTGCCACA GCACCTTGAG TCCGCGGGAC TGGTCGCCGC

605 610 615 620 625 630 635 640 645 650 655 660

TGCACACCTT CCCGGCTGTC CTACAGTCCT CAGGACTCTA CTCCCTCAGC AGCGTGGTGA 
ACGTGTGGAA GGGCCGACAG GATGTCAGGA GTCCTGAGAT GAGGGAGTCG TCGCACCACT 

665 670 675 680 685 690 695 700 705 710 715 720 
****** 

CCGTGCCCTC CAGCAGCTTG GGCACCCAGA CCTACATCTG CAACGTGAAT CACAAGCCCA 
GGCACGGGAG GTCGTCGAAC CCGTGGGTCT GGATGTAGAC GTTGCACTTA GTGTTCGGGT 

725 730 735 740 745 750 755 760 765 770 775 780 
****** 

GCAACACCAA GGTGGACAAG AAAGTTGGTG AGAGGCCAGC ACAGGGAGGG AGGGTGTCTG 
CGTTGTGGTT CCACCTGTTC TTTCAACCAC TCTCCGGTCG TGTCCCTCCC TCCCACAGAC 



785 790 795 800 805 810 815 820 825 830 835 840
******



CTGGAAGCCA GGCTCAGCGC TCCTGCCTGG ACGCATCCCG GCTATGCAGC CCCAGTCCAG 
GACCTTCGGT CCGAGTCGCG AGGACGGACC TGCGTAGGGC CGATACGTCG GGGTCAGGTC 

845 850 855 860 865 870 875 880 885 890 895 900 
****** 

GGCAGCAAGG CAGGCCCCGT CTGCCTCTTC ACCCGGAGGC CTCTGCCCGC CCCACTCATG 
CCGTCGTTCC GTCCGGGGCA GACGGAGAAG TGGGCCTCCG GAGACGGGCG GGGTGAGTAC 



905 910 915 920 925 930 935 940 945 950 955 960
****** 

CTCAGGGAGA GGGTCTTCTG GCTTTTTCCC CAGGCTCTGG GCAGGCACAG GCTAGGTGCC 
GAGTCCCTCT CCCAGAAGAC CGAAAAAGGG GTCCGAGACC CGTCCGTGTC CGATCCACGG 



965 970 975 980 985 990 995 1000 1005 1010 1015 1020 
****** 

CCTAACCCAG GCCCTGCACA CAAAGGGGCA GGTGCTGGGC TCAGACCTGC CAAGAGCCAT 
GGATTGGGTC CGGGACGTGT GTTTCCCCGT CCACGACCCG AGTCTGGACG GTTCTCGGTA 

1025 1030 1035 1040 1045 1050 1055 1060 1065 1070 1075 1080 
****** 

ATCCGGGAGG ACCCTGCCCC TGACCTAAGC CCACCCCAAA GGCCAAACTC TCCACTCCCT 
TAGGCCCTCC TGGGACGGGG ACTGGATTCG GGTGGGGTTT CCGGTTTGAG AGGTGAGGGA 



1085 1090 1095 1100 1105 1110 1115 1120 1125 1130 1135 1140
******

CAGCTCGGAC ACCTTCTCTC CTCCCAGATT CGAGTAACTC CCAATCTTCT CTCTGCAGAG 
GTCGAGCCTG TGGAAGAGAG GAGGGTCTAA GCTCATTGAG GGTTAGAAGA GAGACGTCTC 



1145 1150 1155 1160 1165 1170 1175 1180 1185 1190 1195 1200
******



CCCAAATCTT GTGACAAAAC TCACACATGC CCACCGTGCC CAGGTAAGCC AGCCCAGGCC 
GGGTTTAGAA CACTGTTTTG AGTGTGTACG GGTGGCACGG GTCCATTCGG TCGGGTCCGG 



1205 1210 1215 1220 1225 1230 1235 1240 1245 1250 1255 1260
******



TCGCCCTCCA GCTCAAGGCG GGACAGGTGC CCTAGAGTAG CCTGCATCCA GGGACAGGCC 
AGCGGGAGGT CGAGTTCCGC CCTGTCCACG GGATCTCATC GGACGTAGGT CCCTGTCCGG 

1265 1270 1275 1280 1285 1290 1295 1300 1305 1310 1315 1320 
* * * * * * 

CCAGCCGGGT GCTGACACGT CCACCTCCAT CTCTCCCTCA GCACCTGAGG CCGCGGGAGG 
GGTCGGCCCA CGACTGTGCA GGTGGAGGTA GAGAGGGAGT CGTGGACTCC GGCGCCCTCC 
1325 1330 1335 1340 1345 1350 1355 1360 1365 1370 1375 1380 
****** 

ACCATCAGTC TTCCTCTTCC CCCCAAAACC CAAGGACACC CTCATGATCT CCCGGACCCC 
TGGTAGTCAG AAGGAGAAGG GGGGTTTTGG GTTCCTGTGG GAGTACTAGA GGGCCTGGGG 

1385 1390 1395 1400 1405 1410 1415 1420 1425 1430 1435 1440 
****** 

TGAGGTCACA TGCGTGGTGG TGGACGTGAG CCACGAAGAC CCTGAGGTCA AGTTCAACTG 
ACTCCAGTGT ACGCACCACC ACCTGCACTC GGTGCTTCTG GGACTCCAGT TCAAGTTGAC 

1445 1450 1455 1460 1465 1470 1475 1480 1485 1490 1495 1500 
****** 

GTACGTGGAC GGCGTGGAGG TGCATAATGC CAAGACAAAG CCGCGGGAGG AGCAGTACAA 
CATGCACCTG CCGCACCTCC ACGTATTACG GTTCTGTTTC GGCGCCCTCC TCGTCATGTT 

1505 1510 1515 1520 1525 1530 1535 1540 1545 1550 1555 1560 

* ***** 

CAGCACGTAC CGTGTGGTCA GCGTCCTCAC CGTCCTGCAC CAGGACTGGC TGAATGGCAA 
GTCGTGCATG GCACACCAGT CGCAGGAGTG GCAGGACGTG GTCCTGACCG ACTTACCGTT 

1565 1570 1575 1580 1585 1590 1595 1600 1605 1610 1615 1620 
****** 

GGAGTACAAG TGCAAGGTCT CCAACAAAGC CCTCCCAGCC CCCATCGAGA AAACCATCTC 
CCTCATGTTC ACGTTCCAGA GGTTGTTTCG GGAGGGTCGG GGGTAGCTCT TTTGGTAGAG 

1625 1630 1635 1640 1645 1650 1655 1660 1665 1670 1675 1680 
****** 

CAAAGCCAAA GGTGGGACCC GTGGGGTGCG AGGGCCACAT GGACAGAGGC CGGCTCGGCC 
GTTTCGGTTT CCACCCTGGG CACCCCACGC TCCCGGTGTA CCTGTCTCCG GCCGAGCCGG 

1685 1690 1695 1700 1705 1710 1715 1720 1725 1730 1735 1740 
****** 

CACCCTCTGC CCTGAGAGTG ACCGCTGTAC CAACCTCTGT CCCTACAGGG CAGCCCCGAG 
GTGGGAGACG GGACTCTCAC TGGCGACATG GTTGGAGACA GGGATGTCCC GTCGGGGCTC 

1745 1750 1755 1760 1765 1770 1775 1780 1785 1790 1795 1800 

* * * * * * 

AACCACAGGT GTACACCCTG CCCCCATCCC GGGATGAGCT GACCAAGAAC CAGGTCAGCC 
TTGGTGTCCA CATGTGGGAC GGGGGTAGGG CCCTACTCGA CTGGTTCTTG GTCCAGTCGG 

1805 1810 1815 1820 1825 1830 1835 1840 1845 1850 1855 1860 
****** 

TGACCTGCCT GGTCAAAGGC TTCTATCCCA GCGACATCGC CGTGGAGTGG GAGAGCAATG 
ACTGGACGGA CCAGTTTCCG AAGATAGGGT CGCTGTAGCG GCACCTCACC CTCTCGTTAC 

1865 1870 1875 1880 1885 1890 1895 1900 1905 1910 1915 1920 
****** 

GGCAGCCGGA GAACAACTAC AAGACCACGC CTCCCGTGCT GGACTCCGAC GGCTCCTTCT 
CCGTCGGCCT CTTGTTGATG TTCTGGTGCG GAGGGCACGA CCTGAGGCTG CCGAGGAAGA 

1925 1930 1935 1940 1945 1950 1955 1960 1965 1970 1975 1980 
****** 

TCCTCTACAG CAAGCTCACC GTGGACAAGA GCAGGTGGCA GCAGGGGAAC GTCTTCTCAT 
AGGAGATGTC GTTCGAGTGG CACCTGTTCT CGTCCACCGT CGTCCCCTTG CAGAAGAGTA 

1985 1990 1995 2000 2005 2010 2015 2020 2025 2030 2035 2040 
****** 

GCTCCGTGAT GCATGAGGCT CTGCACAACC ACTACACGCA GAAGAGCCTC TCCCTGTCTC 
CGAGGCACTA CGTACTCCGA GACGTGTTGG TGATGTGCGT CTTCTCGGAG AGGGACAGAG 
2045 2050 2055 2060 2065 2070 2075 2080 2085 2090 2095 2100 
****** 

CGGGTAAATG AGTGCGACGG CCGGCAAGCC CCCGCTCCCC GGGCTCTCGC GGTCGCACGA 
GCCCATTTAC TCACGCTGCC GGCCGTTCGG GGGCGAGGGG CCCGAGAGCG CCAGCGTGCT 

2105 2110 2115 2120 2125 2130 2135 2140 2145 2150 2155 2160 
****** 

GGATGCTTGG CACGTACCCC CTGTACATAC TTCCCGGGCG CCCAGCATGG AAATAAAGCA 
CCTACGAACC GTGCATGGGG GACATGTATG AAGGGCCCGC GGGTCGTACC TTTATTTCGT 

2165 2170 2175 2180 2185 2190 2195 2200 2205 2210 2215 2220 
****** 

CCCAGCGCTG CCCTGGGCCC CTGCGAGACT GTGATGGTTC TTTCCACGGG TCAGGCCGAG 
GGGTCGCGAC GGGACCCGGG GACGCTCTGA CACTACCAAG AAAGGTGCCC AGTCCGGCTC 

2225 2230 2235 2240 2245 2250 2255 2260 2265 2270 2275 2280 
****** 

TCTGAGGCCT GAGTGGCATG AGGGAGGCAG AGCGGGTCCC ACTGTCCCCA CACTGGCCCA 
AGACTCCGGA CTCACCGTAC TCCCTCCGTC TCGCCCAGGG TGACAGGGGT GTGACCGGGT 

2285 2290 2295 2300 2305 2310 2315 2320 2325 2330 2335 2340 
****** 

GGCTGTGCAG GTGTGCCTGG GCCGCCTAGG GTGGGGCTCA GCCAGGGGCT GCCCTCGGCA 
CCGACACGTC CACACGGACC CGGCGGATCC CACCCCGAGT CGGTCCCCGA CGGGAGCCGT 

2345 2350 2355 2360 2365 2370 2375 2380 2385 2390 2395 2400 

* * * * * * 

GGGTGGGGGA TTTGCCAGCG TTGCCCTCCC TCCAGCAGCA CCTGCCCTGG GCTGGGCCAC 
CCCACCCCCT AAACGGTCGC AACGGGAGGG AGGTCGTCGT GGACGGGACC CGACCCGGTG 

2405 2410 2415 2420 2425 2430 2435 2440 2445 2450 2455 2460 

* * * * * * 

GGGAAGCCCT AGGAGCCCCT GGGGACAGAC ACACAGCCCC TGCCTCTGTA GGAGACTGTC 
CCCTTCGGGA TCCTCGGGGA CCCCTGTCTG TGTGTCGGGG ACGGAGACAT CCTCTGACAG 

2465 2470 2475 2480 2485 2490 2495 2500 2505 2510 2515 2520 

* * * * * * 

CTGTTCTGTG AGCGCCCTGT CCTCCGACCT CCATGCCCAC TCGGGGGCAT GCCTAGTCCA 
GACAAGACAC TCGCGGGACA GGAGGCTGGA GGTACGGGTG AGCCCCCGTA CGGATCAGGT 

2525 2530 2535 2540 2545 2550 2555 2560 2565 2570 2575 2580 
****** 

TGTGCGTAGG GACAGGCCCT CCCTCACCCA TCTACCCCCA CGGCACTAAC CCCTGGCTGT 
ACACGCATCC CTGTCCGGGA GGGAGTGGGT AGATGGGGGT GCCGTGATTG GGGACCGACA 

2585 2590 2595 2600 2605 2610 2615 2620 2625 2630 2635 2640 
****** 

CCTGCCCAGC CTCGCACCCG CATGGGGACA CAACCGACTC CGGGGACATG CACTCTCGGG 
GGACGGGTCG GAGCGTGGGC GTACCCCTGT GTTGGCTGAG GCCCCTGTAC GTGAGAGCCC 

2645 2650 2655 2660 2665 2670 2675 2680 2685 2690 2695 2700 
****** 

CCCTGTGGAG GGACTGGTGC AGATGCCCAC ACACACACTC AGTCCAGACC CGTTCAACAA 
GGGACACCTC CCTGACCACG TCTACGGGTG TGTGTGTGAG TCAGGTCTGG GCAAGTTGTT 

2705 2710 2715 2720 2725 2730 2735 2740 2745 2750 2755 2760 
****** 

AACCCCCGCA CTGAGGTTGG CCGGCCACAC GGCCACCACA CACACACGTG CACGCCTCAC 
TTGGGGGCGT GACTCCAACC GGCCGGTGTG CCGGTGGTGT GTGTGTGCAC GTGCGGAGTG 
2765 2770 2775 2780 2785 2790 2795 2800 2805 2810 2815 2820 
****** 

ACACGGAGCC TCACCCGGGC GAACTGCACA GCACCCAGAC CAGAGCAAGG TCCTCGCACA 
TGTGCCTCGG AGTGGGCCCG CTTGACGTGT CGTGGGTCTG GTCTCGTTCC AGGAGCGTGT 

2825 2830 2835 2840 2845 2850 2855 2860 2865 2870 2875 2880 
****** 

CGTGAACACT CCTCGGACAC AGGCCCCCAC GAGCCCCACG CGGCACCTCA AGGCCCACGA 
GCACTTGTGA GGAGCCTGTG TCCGGGGGTG CTCGGGGTGC GCCGTGGAGT TCCGGGTGCT 

2885 2890 2895 2900 2905 2910 2915 2920 2925 2930 2935 2940 
****** 

GCCTCTCGGC AGCTTCTCCA CATGCTGACC TGCTCAGACA AACCCAGCCC TCCTCTCACA 
CGGAGAGCCG TCGAAGAGGT GTACGACTGG ACGAGTCTGT TTGGGTCGGG AGGAGAGTGT 

2945 2950 2955 2960 2965 2970 2975 2980 2985 2990 2995 3000 
****** 

AGGGTGCCCC TGCAGCCGCC ACACACACAC AGGGGATCAC ACACCACGTC ACGTCCCTGG 
TCCCACGGGG ACGTCGGCGG TGTGTGTGTG TCCCCTAGTG TGTGGTGCAG TGCAGGGACC 

3005 3010 3015 3020 3025 3030 3035 3040 3045 3050 3055 3060 
****** 

CCCTGGCCCA CTTCCCAGTG CCGCCCTTCC CTGCAGGGCG GATCATAATC AGCCATACCA 
GGGACCGGGT GAAGGGTCAC GGCGGGAAGG GACGTCCCGC CTAGTATTAG TCGGTATGGT 

3065 3070 3075 3080 3085 3090 3095 3100 3105 3110 3115 3120 
****** 

CATTTGTAGA GGTTTTACTT GCTTTAAAAA ACCTCCCACA CCTCCCCCTG AACCTGAAAC 
GTAAACATCT CCAAAATGAA CGAAATTTTT TGGAGGGTGT GGAGGGGGAC TTGGACTTTG 

3125 3130 3135 3140 3145 3150 3155 3160 3165 3170 3175 3180 
****** 

ATAAAATGAA TGCAATTGTT GTTGTTAACT TGTTTATTGC AGCTTATAAT GGTTACAAAT 
TATTTTACTT ACGTTAACAA CAACAATTGA ACAAATAACG TCGAATATTA CCAATGTTTA 

3185 3190 3195 3200 3205 3210 3215 3220 3225 3230 3235 3240 
****** 

AAAGCAATAG CATCACAAAT TTCACAAATA AAGCATTTTT TTCACTGCAT TCTAGTTGTG 
TTTCGTTATC GTAGTGTTTA AAGTGTTTAT TTCGTAAAAA AAGTGACGTA AGATCAACAC 

3245 3250 3255 3260 3265 3270 3275 3280 
* * * * 

GTTTGTCCAA ACTCATCAAT GTATCTTATC ATGTCTAGAT CC 
CAAACAGGTT TGAGTAGTTA CATAGAATAG TACAGATCTA GG 



FIG. 27E 







FIG. 28 



WO 96/02273 PCT/US95/08743 

5 10 15 20 25 30 35 40 45 50 55 60 

* * * * * * 

TTCATTGATC ATTAATCAGC CATACCACAT TTGTAGAGGT TTTACTTGCT TTAAAAAACC 
AAGTAACTAG TAATTAGTCG GTATGGTGTA AACATCTCCA AAATGAACGA AATTTTTTGG 

65 70 75 80 85 90 95 100 105 110 115 120 

* * * * * * 

TCCCACACCT CCCCCTGAAC CTGAAACATA AAATGAATGC AATTGTTGTT GTTAACTTGT 
AGGGTGTGGA GGGGGACTTG GACTTTGTAT TTTACTTACG TTAACAACAA CAATTGAACA 

125 130 135 140 145 150 155 160 165 170 175 180 
***** * 

TTATTGCAGC TTATAATGGT TACAAATAAA GCAATAGCAT CACAAATTTC ACAAATAAAG 
AATAACGTCG AATATTACCA ATGTTTATTT CGTTATCGTA GTGTTTAAAG TGTTTATTTC 

185 190 195 200 205 210 215 220 225 230 235 240 
****** 

CATTTTTTTC ACTGCATTCT AGTTGTGGTT TGTCCAAACT CATCAATGTA TCTTATCATG 
GTAAAAAAAG TGACGTAAGA TCAACACCAA ACAGGTTTGA GTAGTTACAT AGAATAGTAC 

245 250 255 260 265 270 275 280 285 290 295 300 
****** 

TCTGGATCTC TAGCTTCGTG TCAAGGACGG TGACTGCAGT GAATAATAAA ATGTGTGTTT 
AGACCTAGAG ATCGAAGCAC AGTTCCTGCC ACTGACGTCA CTTATTATTT TACACACAAA 

305 310 315 320 325 330 335 340 345 350 355 360 

* * * * * * 

GTCCGAAATA CGCGTTTTGA GATTTCTGTC GCCGACTAAA TTCATGTCGC GCGATAGTGG 
CAGGCTTTAT GCGCAAAACT CTAAAGACAG CGGCTGATTT AAGTACAGCG CGCTATCACC 

365 370 375 380 385 390 395 400 405 410 415 420 
****** 

TGTTTATCGC CGATAGAGAT GGCGATATTG GAAAAATCGA TATTTGAAAA TATGGCATAT 
ACAAATAGCG GCTATCTCTA CCGCTATAAC CTTTTTAGCT ATAAACTTTT ATACCGTATA 

425 430 435 440 445 450 455 460 465 470 475 480 
****** 

TGAAAATGTC GCCGATGTGA GTTTCTGTGT AACTGATATC GCCATTTTTC CAAAAGTGAT 
ACTTTTACAG CGGCTACACT CAAAGACACA TTGACTATAG CGGTAAAAAG GTTTTCACTA 

485 490 495 500 505 510 515 520 525 530 535 540 

* * * * * * 

TTTTGGGCAT ACGCGATATC TGGCGATAGC GCTTATATCG TTTACGGGGG ATGGCGATAG 
AAAACCCGTA TGCGCTATAG ACCGCTATCG CGAATATAGC AAATGCCCCC TACCGCTATC 

545 550 555 560 565 570 575 580 585 590 595 600 
***** * 

ACGACTTTGG TGACTTGGGC GATTCTGTGT GTCGCAAATA TCGCAGTTTC GATATAGGTG 
TGCTGAAACC ACTGAACCCG CTAAGACACA CAGCGTTTAT AGCGTCAAAG CTATATCCAC 

605 610 615 620 625 630 635 640 645 650 655 660 
***** 

ACAGACGATA TGAGGCTATA TCGCCGATAG AGGCGACATC AAGCTGGCAC ATGGCCAATG 
TGTCTGCTAT ACTCCGATAT AGCGGCTATC TCCGCTGTAG TTCGACCGTG TACCGGTTAC 

665 670 675 680 685 690 695 700 705 710 715 720 
* * * * * * 

CATATCGATC TATACATTGA ATCAATATTG GCCATTAGCC ATATTATTCA TTGGTTATAT 
GTATAGCTAG ATATGTAACT TAGTTATAAC CGGTAATCGG TATAATAAGT AACCAATATA 
725 730 735 740 745 750 755 760 765 770 775 780 
****** 

AGCATAAATC AATATTGGCT ATTGGCCATT GCATACGTTG TATCCATATC ATAATATGTA 
TCGTATTTAG TTATAACCGA TAACCGGTAA CGTATGCAAC ATAGGTATAG TATTATACAT 

785 790 795 800 805 810 815 820 825 830 835 840 
****** 

CATTTATATT GGCTCATGTC CAACATTACC GCCATGTTGA CATTGATTAT TGACTAGTTA 
GTAAATATAA CCGAGTACAG GTTGTAATGG CGGTACAACT GTAACTAATA ACTGATCAAT 

845 850 855 860 865 870 875 880 885 890 895 900 
****** 

TTAATAGTAA TCAATTACGG GGTCATTAGT TCATAGCCCA TATATGGAGT TCCGCGTTAC 
AATTATCATT AGTTAATGCC CCAGTAATCA AGTATCGGGT ATATACCTCA AGGCGCAATG 

905 910 915 920 925 930 935 940 945 950 955 960 

* * * * * * 

ATAACTTACG GTAAATGGCC CGCCTGGCTG ACCGCCCAAC GACCCCCGCC CATTGACGTC 
TATTGAATGC CATTTACCGG GCGGACCGAC TGGCGGGTTG CTGGGGGCGG GTAACTGCAG 

965 970 975 980 985 990 995 1000 1005 1010 1015 1020 
****** 

AATAATGACG TATGTTCCCA TAGTAACGCC AATAGGGACT TTCCATTGAC GTCAATGGGT 
TTATTACTGC ATACAAGGGT ATCATTGCGG TTATCCCTGA AAGGTAACTG CAGTTACCCA 

1025 1030 1035 1040 1045 1050 1055 1060 1065 1070 1075 1080 
****** 

GGAGTATTTA CGGTAAACTG CCCACTTGGC AGTACATCAA GTGTATCATA TGCCAAGTAC 
CCTCATAAAT GCCATTTGAC GGGTGAACCG TCATGTAGTT CACATAGTAT ACGGTTCATG 

1085 1090 1095 1100 1105 1110 1115 1120 1125 1130 1135 1140 
****** 

GCCCCCTATT GACGTCAATG ACGGTAAATG GCCCGCCTGG CATTATGCCC AGTACATGAC 
CGGGGGATAA CTGCAGTTAC TGCCATTTAC CGGGCGGACC GTAATACGGG TCATGTACTG 

1145 1150 1155 1160 1165 1170 1175 1180 1185 1190 1195 1200 
****** 

CTTATGGGAC TTTCCTACTT GGCAGTACAT CTACGTATTA GTCATCGCTA TTACCATGGT 
GAATACCCTG AAAGGATGAA CCGTCATGTA GATGCATAAT CAGTAGCGAT AATGGTACCA 

1205 1210 1215 1220 1225 1230 1235 1240 1245 1250 1255 1260 
* * * * * * 

GATGCGGTTT TGGCAGTACA TCAATGGGCG TGGATAGCGG TTTGACTCAC GGGGATTTCC 
CTACGCCAAA ACCGTCATGT AGTTACCCGC ACCTATCGCC AAACTGAGTG CCCCTAAAGG 

1265 1270 1275 1280 1285 1290 1295 1300 1305 1310 1315 1320 

* * * * * * 

AAGTCTCCAC CCCATTGACG TCAATGGGAG TTTGTTTTGG CACCAAAATC AACGGGACTT 
TTCAGAGGTG GGGTAACTGC AGTTACCCTC AAACAAAACC GTGGTTTTAG TTGCCCTGAA 

1325 1330 1335 1340 1345 1350 1355 1360 1365 1370 1375 1380 
****** 
TCCAAAATGT CGTAACAACT CCGCCCCATT GACGCAAATG GGCGGTAGGC GTGTACGGTG 
AGGTTTTACA GCATTGTTGA GGCGGGGTAA CTGCGTTTAC CCGCCATCCG CACATGCCAC 

1385 1390 1395 1400 1405 1410 1415 1420 1425 1430 1435 1440 
* * * * * * 
GGAGGTCTAT ATAAGCAGAG CTCGTTTAGT GAACCGTCAG ATCGCCTGGA GACGCCATCC 
CCTCCAGATA TATTCGTCTC GAGCAAATCA CTTGGCAGTC TAGCGGACCT CTGCGGTAGG 
1445 1450 1455 1460 1465 1470 1475 1480 1485 1490 1495 1500 
****** 

ACGCTGTTTT GACCTCCATA GAAGACACCG GGACCGATCC AGCCTCCGCG GCCGGGAACG 
TGCGACAAAA CTGGAGGTAT CTTCTGTGGC CCTGGCTAGG TCGGAGGCGC CGGCCCTTGC 

1505 1510 1515 1520 1525 1530 1535 1540 1545 1550 1555 1560 
****** 

GTGCATTGGA ACGCGGATTC CCCGTGCCAA GAGTGACGTA AGTACCGCCT ATAGAGTCTA 
CACGTAACCT TGCGCCTAAG GGGCACGGTT CTCACTGCAT TCATGGCGGA TATCTCAGAT 

1565 1570 1575 1580 1585 1590 1595 1600 1605 1610 1615 1620 

* * * * * * 

TAGGCCCACC CCCTTGGCTT CTTATGCATG CTATACTGTT TTTGGCTTGG GGTCTATACA 
ATCCGGGTGG GGGAACCGAA GAATACGTAC GATATGACAA AAACCGAACC CCAGATATGT 

1625 1630 1635 1640 1645 1650 1655 1660 1665 1670 1675 1680 
****** 

CCCCCGCTTC CTCATGTTAT AGGTGATGGT ATAGCTTAGC CTATAGGTGT GGGTTATTGA 
GGGGGCGAAG GAGTACAATA TCCACTACCA TATCGAATCG GATATCCACA CCCAATAACT 

1685 1690 1695 1700 1705 1710 1715 1720 1725 1730 1735 1740 
****** 

CCATTATTGA CCACTCCCCT ATTGGTGACG ATACTTTCCA TTACTAATCC ATAACATGGC 
GGTAATAACT GGTGAGGGGA TAACCACTGC TATGAAAGGT AATGATTAGG TATTGTACCG 

1745 1750 1755 1760 1765 1770 1775 1780 1785 1790 1795 1800 

* * * * * * 

TCTTTGCCAC AACTCTCTTT ATTGGCTATA TGCCAATACA CTGTCCTTCA GAGACTGACA 
AGAAACGGTG TTGAGAGAAA TAACCGATAT ACGGTTATGT GACAGGAAGT CTCTGACTGT 

1805 1810 1815 1820 1825 1830 1835 1840 1845 1850 1855 1860 
****** 

CGGACTCTGT ATTTTTACAG GATGGGGTCT CATTTATTAT TTACAAATTC ACATATACAA 
GCCTGAGACA TAAAAATGTC CTACCCCAGA GTAAATAATA AATGTTTAAG TGTATATGTT 

1865 1870 1875 1880 1885 1890 1895 1900 1905 1910 1915 1920 

* * * * * * 

CACCACCGTC CCCAGTGCCC GCAGTTTTTA TTAAACATAA CGTGGGATCT CCACGCGAAT 
GTGGTGGCAG GGGTCACGGG CGTCAAAAAT AATTTGTATT GCACCCTAGA GGTGCGCTTA 

1925 1930 1935 1940 1945 1950 1955 1960 1965 1970 1975 1980 
****** 

CTCGGGTACG TGTTCCGGAC ATGGGCTCTT CTCCGGTAGC GGCGGAGCTT CTACATCCGA 
GAGCCCATGC ACAAGGCCTG TACCCGAGAA GAGGCCATCG CCGCCTCGAA GATGTAGGCT 

1985 1990 1995 2000 2005 2010 2015 2020 2025 2030 2035 2040 
****** 

GCCCTGCTCC CATGCCTCCA GCGACTCATG GTCGCTCGGC AGCTCCTTGC TCCTAACAGT 
CGGGACGAGG GTACGGAGGT CGCTGAGTAC CAGCGAGCCG TCGAGGAACG AGGATTGTCA 

2045 2050 2055 2060 2065 2070 2075 2080 2085 2090 2095 2100 
****** 

GGAGGCCAGA CTTAGGCACA GCACGATGCC CACCACCACC AGTGTGCCGC ACAAGGCCGT 
CCTCCGGTCT GAATCCGTGT CGTGCTACGG GTGGTGGTGG TCACACGGCG TGTTCCGGCA 

2105 2110 2115 2120 2125 2130 2135 2140 2145 2150 2155 2160 
****** 

GGCGGTAGGG TATGTGTCTG AAAATGAGCT CGGGGAGCGG GCTTGCACCG CTGACGCATT 
CCGCCATCCC ATACACAGAC TTTTACTCGA GCCCCTCGCC CGAACGTGGC GACTGCGTAA 
2165 2170 2175 2180 2185 2190 2195 2200 2205 2210 2215 2220 

****** 

TGGAAGACTT AAGGCAGCGG CAGAAGAAGA TGCAGGCAGC TGAGTTGTTG TGTTCTGATA 
ACCTTCTGAA TTCCGTCGCC GTCTTCTTCT ACGTCCGTCG ACTCAACAAC ACAAGACTAT 

2225 2230 2235 2240 2245 2250 2255 2260 2265 2270 2275 2280 

* * * * * * 

AGAGTCAGAG GTAACTCCCG TTGCGGTGCT GTTAACGGTG GAGGGCAGTG TAGTCTGAGC 
TCTCAGTCTC CATTGAGGGC AACGCCACGA CAATTGCCAC CTCCCGTCAC ATCAGACTCG 

2285 2290 2295 2300 2305 2310 2315 2320 2325 2330 2335 2340 

* * * * * * 

AGTACTCGTT GCTGCCGCGC GCGCCACCAG ACATAATAGC TGACAGACTA ACAGACTGTT 
TCATGAGCAA CGACGGCGCG CGCGGTGGTC TGTATTATCG ACTGTCTGAT TGTCTGACAA 

2345 2350 2355 2360 2365 2370 2375 2380 2385 2390 2395 2400 
****** 

CCTTTCCATG GGTCTTTTCT GCAGTCACCG TCCTTGACAC GAAGCTTGGG CTGCAGGTCG 
GGAAAGGTAC CCAGAAAAGA CGTCAGTGGC AGGAACTGTG CTTCGAACCC GACGTCCAGC 

2405 2410 2415 2420 2425 2430 2435 2440 2445 2450 2455 2460 

* * * * * * 

ATCGACTCTA GAGGATCGAT CCCCGGGCGA GCTCGAATTC GCCGCCACCA TGGAATGGAG 
TAGCTGAGAT CTCCTAGCTA GGGGCCCGCT CGAGCTTAAG CGGCGGTGGT ACCTTACCTC 

2465 2470 2475 2480 2485 2490 2495 2500 2505 2510 2515 2520 

* * * * * * 

CTGGGTCTTT CTCTTCTTCC TGTCAGTAAC TACAGGTGTC CACTCCCAGG TTCAGCTGGT 
GACCCAGAAA GAGAAGAAGG ACAGTCATTG ATGTCCACAG GTGAGGGTCC AAGTCGACCA 

2525 2530 2535 2540 2545 2550 2555 2560 2565 2570 2575 2580 
****** 

TCAGTCCGGG GCTGAGGTGA AGAAGCCTGG GGCCTCAGTG AAGGTTTCTT GTCAGGCTTC 
AGTCAGGCCC CGACTCCACT TCTTCGGACC CCGGAGTCAC TTCCAAAGAA CAGTCCGAAG 

2585 2590 2595 2600 2605 2610 2615 2620 2625 2630 2635 2640 
****** 

TGGATACAGA TTCAGTAACT TTGTTATTCA TTGGGTGCGC CAGGCCCCCG GACAGAGGTT 
ACCTATGTCT AAGTCATTGA AACAATAAGT AACCCACGCG GTCCGGGGGC CTGTCTCCAA 

2645 2650 2655 2660 2665 2670 2675 2680 2685 2690 2695 2700 
****** 

TGAGTGGATG GGATGGATCA ATCCTTACAA CGGAAACAAA GAATTTTCAG CGAAGTTCCA 
ACTCACCTAC CCTACCTAGT TAGGAATGTT GCCTTTGTTT CTTAAAAGTC GCTTCAAGGT 

2705 2710 2715 2720 2725 2730 2735 2740 2745 2750 2755 2760 
****** 

GGACAGAGTC ACCTTTACCG CGGACACATC CGCGAACACA GCCTACATGG AGTTGAGGAG 
CCTGTCTCAG TGGAAATGGC GCCTGTGTAG GCGCTTGTGT CGGATGTACC TCAACTCCTC 

2765 2770 2775 2780 2785 2790 2795 2800 2805 2810 2815 2820 
****** 

CCTCAGGTCT GCAGACACGG CTGTTTATTA TTGTGCGAGA GTGGGGCCAT ATAGTTGGGA 
GGAGTCCAGA CGTCTGTGCC GACAAATAAT AACACGCTCT CACCCCGGTA TATCAACCCT 

2825 2830 2835 2840 2845 2850 2855 2860 2865 2870 2875 2880 
****** 

TGATTCTCCC CAGGACAATT ATTATATGGA CGTCTGGGGC AAAGGAACCA CGGTCATCGT 
ACTAAGAGGG GTCCTGTTAA TAATATACCT GCAGACCCCG TTTCCTTGGT GCCAGTAGCA 
2885 2890 2895 2900 2905 2910 2915 2920 2925 2930 2935 2940 
* * * * * * 

GAGCTCAGCT TCCACCAAGG GCCCATCGGT CTTCCCCCTG GCACCCTCCT CCAAGAGCAC 
CTCGAGTCGA AGGTGCTTCC CGGGTAGCCA GAAGGGGGAC CGTGGGAGGA GGTTCTCGTG 

2945 2950 2955 2960 2965 2970 2975 2980 2985 2990 2995 3000 

* * * * * * 

CTCTGGGGGC ACAGCGGCCC TGGGCTGCCT GGTCAAGGAC TACTTCCCCG AACCGGTGAC 
GAGACCCCCG TGTCGCCGGG ACCCGACGGA CCAGTTCCTG ATGAAGGGGC TTGGCCACTG 

3005 3010 3015 3020 3025 3030 3035 3040 3045 3050 3055 3060 
****** 

GGTGTCGTGG AACTCAGGCG CCCTGACCAG CGGCGTGCAC ACCTTCCCGG CTGTCCTACA 
CCACAGCACC TTGAGTCCGC GGGACTGGTC GCCGCACGTG TGGAAGGGCC GACAGGATGT 

3065 3070 3075 3080 3085 3090 3095 3100 3105 3110 3115 3120 
****** 

GTCCTCAGGA CTCTACTCCC TCAGCAGCGT GGTGACCGTG CCCTCCAGCA GCTTGGGCAC 
CAGGAGTCCT GAGATGAGGG AGTCGTCGCA CCACTGGCAC GGGAGGTCGT CGAACCCGTG 

3125 3130 3135 3140 3145 3150 3155 3160 3165 3170 3175 3180 

* ***** 

CCAGACCTAC ATCTGCAACG TGAATCACAA GCCCAGCAAC ACCAAGGTGG ACAAGAAAGT 
GGTCTGGATG TAGACGTTGC ACTTAGTGTT CGGGTCGTTG TGGTTCCACC TGTTCTTTCA 

3185 3190 3195 3200 3205 3210 3215 3220 3225 3230 3235 3240 

* * * * * * 

TGGTGAGAGG CCAGCACAGG GAGGGAGGGT GTCTGCTGGA AGCCAGGCTC AGCGCTCCTG 
ACCACTCTCC GGTCGTGTCC CTCCCTCCCA CAGACGACCT TCGGTCCGAG TCGCGAGGAC 

3245 3250 3255 3260 3265 3270 3275 3280 3285 3290 3295 3300 

* ***** 

CCTGGACGCA TCCCGGCTAT GCAGCCCCAG TCCAGGGCAG CAAGGCAGGC CCCGTCTGCC 
GGACCTGCGT AGGGCCGATA CGTCGGGGTC AGGTCCCGTC GTTCCGTCCG GGGCAGACGG 

3305 3310 3315 3320 3325 3330 3335 3340 3345 3350 3355 3360 
****** 

TCTTCACCCG GAGGCCTCTG CCCGCCCCAC TCATGCTCAG GGAGAGGGTC TTCTGGCTTT 
AGAAGTGGGC CTCCGGAGAC GGGCGGGGTG AGTACGAGTC CCTCTCCCAG AAGACCGAAA 

3365 3370 3375 3380 3385 3390 3395 3400 3405 3410 3415 3420 

* * * * * * 

TTCCCCAGGC TCTGGGCAGG CACAGGCTAG GTGCCCCTAA CCCAGGCCCT GCACACAAAG 
AAGGGGTCCG AGACCCGTCC GTGTCCGATC CACGGGGATT GGGTCCGGGA CGTGTGTTTC 

3425 3430 3435 3440 3445 3450 3455 3460 3465 3470 3475 3480 

* * * * * * 

GGGCAGGTGC TGGGCTCAGA CCTGCCAAGA GCCATATCCG GGAGGACCCT GCCCCTGACC 
CCCGTCCACG ACCCGAGTCT GGACGGTTCT CGGTATAGGC CCTCCTGGGA CGGGGACTGG 

3485 3490 3495 3500 3505 3510 3515 3520 3525 3530 3535 3540 
****** 

TAAGCCCACC CCAAAGGCCA AACTCTCCAC TCCCTCAGCT CGGACACCTT CTCTCCTCCC 
ATTCGGGTGG GGTTTCCGGT TTGAGAGGTG AGGGAGTCGA GCCTGTGGAA GAGAGGAGGG 

3545 3550 3555 3560 3565 3570 3575 3580 3585 3590 3595 3600 

* * * * * * 

AGATTCGAGT AACTCCCAAT CTTCTCTCTG CAGAGCCCAA ATCTTGTGAC AAAACTCACA 
TCTAAGCTCA TTGAGGGTTA GAAGAGAGAC GTCTCGGGTT TAGAACACTG TTTTGAGTGT 
3605 3610 3615 3620 3625 3630 3635 3640 3645 3650 3655 3660 

* * * * * 

CATGCCCACC GTGCCCAGGT AAGCCAGCCC AGGCCTCGCC CTCCAGCTCA AGGCGGGACA 
GTACGGGTGG CACGGGTCCA TTCGGTCGGG TCCGGAGCGG GAGGTCGAGT TCCGCCCTGT 

3665 3670 3675 3680 3685 3690 3695 3700 3705 3710 3715 3720 
****** 

GGTGCCCTAG AGTAGCCTGC ATCCAGGGAC AGGCCCCAGC CGGGTGCTGA CACGTCCACC 
CCACGGGATC TCATCGGACG TAGGTCCCTG TCCGGGGTCG GCCCACGACT GTGCAGGTGG 

3725 3730 3735 3740 3745 3750 3755 3760 3765 3770 3775 3780 
****** 

TCCATCTCTC CCTCAGCACC TGAGGCCGCG GGAGGACCAT CAGTCTTCCT CTTCCCCCCA 
AGGTAGAGAG GGAGTCGTGG ACTCCGGCGC CCTCCTGGTA GTCAGAAGGA GAAGGGGGGT 

3785 3790 3795 3800 3805 3810 3815 3820 3825 3830 3835 3840 

* * * * * * 

AAACCCAAGG ACACCCTCAT GATCTCCCGG ACCCCTGAGG TCACATGCGT GGTGGTGGAC 
TTTGGGTTCC TGTGGGAGTA CTAGAGGGCC TGGGGACTCC AGTGTACGCA CCACCACCTG 

3845 3850 3855 3860 3865 3870 3875 3880 3885 3890 3895 3900 
****** 

GTGAGCCACG AAGACCCTGA GGTCAAGTTC AACTGGTACG TGGACGGCGT GGAGGTGCAT 
CACTCGGTGC TTCTGGGACT CCAGTTCAAG TTGACCATGC ACCTGCCGCA CCTCCACGTA 

3905 3910 3915 3920 3925 3930 3935 3940 3945 3950 3955 3960 

* * * * * * 

AATGCCAAGA CAAAGCCGCG GGAGGAGCAG TACAACAGCA CGTACCGTGT GGTCAGCGTC 
TTACGGTTCT GTTTCGGCGC CCTCCTCGTC ATGTTGTCGT GCATGGCACA CCAGTCGCAG 

3965 3970 3975 3980 3985 3990 3995 4000 4005 4010 4015 4020 

* * * * * * 

CTCACCGTCC TGCACCAGGA CTGGCTGAAT GGCAAGGAGT ACAAGTGCAA GGTCTCCAAC 
GAGTGGCAGG ACGTGGTCCT GACCGACTTA CCGTTCCTCA TGTTCACGTT CCAGAGGTTG 

4025 4030 4035 4040 4045 4050 4055 4060 4065 4070 4075 4080 
****** 

AAAGCCCTCC CAGCCCCCAT CGAGAAAACC ATCTCCAAAG CCAAAGGTGG GACCCGTGGG 
TTTCGGGAGG GTCGGGGGTA GCTCTTTTGG TAGAGGTTTC GGTTTCCACC CTGGGCACCC 

4085 4090 4095 4100 4105 4110 4115 4120 4125 4130 4135 4140 
****** 

GTGCGAGGGC CACATGGACA GAGGCCGGCT CGGCCCACCC TCTGCCCTGA GAGTGACCGC 
CACGCTCCCG GTGTACCTGT CTCCGGCCGA GCCGGGTGGG AGACGGGACT CTCACTGGCG 

4145 4150 4155 4160 4165 4170 4175 4180 4185 4190 4195 4200 
****** 

TGTACCAACC TCTGTCCCTA CAGGGCAGCC CCGAGAACCA CAGGTGTACA CCCTGCCCCC 
ACATGGTTGG AGACAGGGAT GTCCCGTCGG GGCTCTTGGT GTCCACATGT GGGACGGGGG 

4205 4210 4215 4220 4225 4230 4235 4240 4245 4250 4255 4260 
****** 

ATCCCGGGAT GAGCTGACCA AGAACCAGGT CAGCCTGACC TGCCTGGTCA AAGGCTTCTA 
TAGGGCCCTA CTCGACTGGT TCTTGGTCCA GTCGGACTGG ACGGACCAGT TTCCGAAGAT 

4265 4270 4275 4280 4285 4290 4295 4300 4305 4310 4315 4320 
****** 

TCCCAGCGAC ATCGCCGTGG AGTGGGAGAG CAATGGGCAG CCGGAGAACA ACTACAAGAC 
AGGGTCGCTG TAGCGGCACC TCACCCTCTC GTTACCCGTC GGCCTCTTGT TGATGTTCTG 
4325 4330 4335 4340 4345 4350 4355 4360 4365 4370 4375 4380 

* * * * * 

CACGCCTCCC GTGCTGGACT CCGACGGCTC CTTCTTCCTC TACAGCAAGC TCACCGTGGA 
GTGCGGAGGG CACGACCTGA GGCTGCCGAG GAAGAAGGAG ATGTCGTTCG AGTGGCACCT 

4385 4390 4395 4400 4405 4410 4415 4420 4425 4430 4435 4440 
****** 

CAAGAGCAGG TGGCAGCAGG GGAACGTCTT CTCATGCTCC GTGATGCATG AGGCTCTGCA 
GTTCTCGTCC ACCGTCGTCC CCTTGCAGAA GAGTACGAGG CACTACGTAC TCCGAGACGT 

4445 4450 4455 4460 4465 4470 4475 4480 4485 4490 4495 4500 

* * * * * * 

CAACCACTAC ACGCAGAAGA GCCTCTCCCT GTCTCCGGGT AAATGAGTGC GACGGCCGGC 
GTTGGTGATG TGCGTCTTCT CGGAGAGGGA CAGAGGCCCA TTTACTCACG CTGCCGGCCG 

4505 4510 4515 4520 4525 4530 4535 4540 4545 4550 4555 4560 
****** 

AAGCCCCCGC TCCCCGGGCT CTCGCGGTCG CACGAGGATG CTTGGCACGT ACCCCCTGTA 
TTCGGGGGCG AGGGGCCCGA GAGCGCCAGC GTGCTCCTAC GAACCGTGCA TGGGGGACAT 

4565 4570 4575 4580 4585 4590 4595 4600 4605 4610 4615 4620 
****** 

CATACTTCCC GGGCGCCCAG CATGGAAATA AAGCACCCAG CGCTGCCCTG GGCCCCTGCG 
GTATGAAGGG CCCGCGGGTC GTACCTTTAT TTCGTGGGTC GCGACGGGAC CCGGGGACGC 

4625 4630 4635 4640 4645 4650 4655 4660 4665 4670 4675 4680 

* * * * * * 

AGACTGTGAT GGTTCTTTCC ACGGGTCAGG CCGAGTCTGA GGCCTGAGTG GCATGAGGGA 
TCTGACACTA CCAAGAAAGG TGCCCAGTCC GGCTCAGACT CCGGACTCAC CGTACTCCCT 

4685 4690 4695 4700 4705 4710 4715 4720 4725 4730 4735 4740 
****** 

GGCAGAGCGG GTCCCACTGT CCCCACACTG GCCCAGGCTG TGCAGGTGTG CCTGGGCCGC 
CCGTCTCGCC CAGGGTGACA GGGGTGTGAC CGGGTCCGAC ACGTCCACAC GGACCCGGCG 

4745 4750 4755 4760 4765 4770 4775 4780 4785 4790 4795 4800 
****** 

CTAGGGTGGG GCTCAGCCAG GGGCTGCCCT CGGCAGGGTG GGGGATTTGC CAGCGTTGCC 
GATCCCACCC CGAGTCGGTC CCCGACGGGA GCCGTCCCAC CCCCTAAACG GTCGCAACGG 

4805 4810 4815 4820 4825 4830 4835 4840 4845 4850 4855 4860 

* * * * * * 

CTCCCTCCAG CAGCACCTGC CCTGGGCTGG GCCACGGGAA GCCCTAGGAG CCCCTGGGGA 
GAGGGAGGTC GTCGTGGACG GGACCCGACC CGGTGCCCTT CGGGATCCTC GGGGACCCCT 

4865 4870 4875 4880 4885 4890 4895 4900 4905 4910 4915 4920 
* * * * * * 

CAGACACACA GCCCCTGCCT CTGTAGGAGA CTGTCCTGTT CTGTGAGCGC CCTGTCCTCC 
GTCTGTGTGT CGGGGACGGA GACATCCTCT GACAGGACAA GACACTCGCG GGACAGGAGG 

4925 4930 4935 4940 4945 4950 4955 4960 4965 4970 4975 4980 

* * * * * * 
GACCTCCATG CCCACTCGGG GGCATGCCTA GTCCATGTGC GTAGGGACAG GCCCTCCCTC 
CTGGAGGTAC GGGTGAGCCC CCGTACGGAT CAGGTACACG CATCCCTGTC CGGGAGGGAG 

4985 4990 4995 5000 5005 5010 5015 5020 5025 5030 5035 5040 
* * * * * * 

ACCCATCTAC CCCCACGGCA CTAACCCCTG GCTGTCCTGC CCAGCCTCGC ACCCGCATGG 
TGGGTAGATG GGGGTGCCGT GATTGGGGAC CGACAGGACG GGTCGGAGCG TGGGCGTACC 
5045 5050 5055 5060 5065 5070 5075 5080 5085 5090 5095 5100 
* * * * * * 

GGACACAACC GACTCCGGGG ACATGCACTC TCGGGCCCTG TGGAGGGACT GGTGCAGATG 
CCTGTGTTGG CTGAGGCCCC TGTACGTGAG AGCCCGGGAC ACCTCCCTGA CCACGTCTAC 

5105 5110 5115 5120 5125 5130 5135 5140 5145 5150 5155 5160 

* * * * * * 

CCCACACACA CACTCAGTCC AGACCCGTTC AACAAAACCC CCGCACTGAG GTTGGCCGGC 
GGGTGTGTGT GTGAGTCAGG TCTGGGCAAG TTGTTTTGGG GGCGTGACTC CAACCGGCCG 

5165 5170 5175 5180 5185 5190 5195 5200 5205 5210 5215 5220 
****** 

CACACGGCCA CCACACACAC ACGTGCACGC CTCACACACG GAGCCTCACC CGGGCGAACT 
GTGTGCCGGT GGTGTGTGTG TGCACGTGCG GAGTGTGTGC CTCGGAGTGG GCCCGCTTGA 

5225 5230 5235 5240 5245 5250 5255 5260 5265 5270 5275 5280 
****** 

GCACAGCACC CAGACCAGAG CAAGGTCCTC GCACACGTGA ACACTCCTCG GACACAGGCC 
CGTGTCGTGG GTCTGGTCTC GTTCCAGGAG CGTGTGCACT TGTGAGGAGC CTGTGTCCGG 

5285 5290 5295 5300 5305 5310 5315 5320 5325 5330 5335 5340 
****** 

CCCACGAGCC CCACGCGGCA CCTCAAGGCC CACGAGCCTC TCGGCAGCTT CTCCACATGC 
GGGTGCTCGG GGTGCGCCGT GGAGTTCCGG GTGCTCGGAG AGCCGTCGAA GAGGTGTACG 

5345 5350 5355 5360 5365 5370 5375 5380 5385 5390 5395 5400 

* * * * * * 

TGACCTGCTC AGACAAACCC AGCCCTCCTC TCACAAGGGT GCCCCTGCAG CCGCCACACA 
ACTGGACGAG TCTGTTTGGG TCGGGAGGAG AGTGTTCCCA CGGGGACGTC GGCGGTGTGT 

5405 5410 5415 5420 5425 5430 5435 5440 5445 5450 5455 5460 
****** 

CACACAGGGG ATCACACACC ACGTCACGTC CCTGGCCCTG GCCCACTTCC CAGTGCCGCC 
GTGTGTCCCC TAGTGTGTGG TGCAGTGCAG GGACCGGGAC CGGGTGAAGG GTCACGGCGG 

5465 5470 5475 5480 5485 5490 5495 5500 5505 5510 5515 5520 
****** 

CTTCCCTGCA GGGCGGATCA TAATCAGCCA TACCACATTT GTAGAGGTTT TACTTGCTTT 
GAAGGGACGT CCCGCCTAGT ATTAGTCGGT ATGGTGTAAA CATCTCCAAA ATGAACGAAA 

5525 5530 5535 5540 5545 5550 5555 5560 5565 5570 5575 5580 
****** 

AAAAAACCTC CCACACCTCC CCCTGAACCT GAAACATAAA ATGAATGCAA TTGTTGTTGT 
TTTTTTGGAG GGTGTGGAGG GGGACTTGGA CTTTGTATTT TACTTACGTT AACAACAACA 

5585 5590 5595 5600 5605 5610 5615 5620 5625 5630 5635 5640 

* * * * * * 

TAACTTGTTT ATTGCAGCTT ATAATGGTTA CAAATAAAGC AATAGCATCA CAAATTTCAC 
ATTGAACAAA TAACGTCGAA TATTACCAAT GTTTATTTCG TTATCGTAGT GTTTAAAGTG 

5645 5650 5655 5660 5665 5670 5675 5680 5685 5690 5695 5700 
****** 

AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG TCCAAACTCA TCAATGTATC 
TTTATTTCGT AAAAAAAGTG ACGTAAGATC AACACCAAAC AGGTTTGAGT AGTTACATAG 

5705 5710 5715 5720 5725 5730 5735 5740 5745 5750 5755 5760 

* * * * * * 

TTATCATGTC TGAGATCCTC TACGCCGGAC GCATCGTGGC CGGCATCACC GGCGCCACAG 
AATAGTACAG ACTCTAGGAG ATGCGGCCTG CGTAGCACCG GCCGTAGTGG CCGCGGTGTC 
5765 5770 5775 5780 5785 5790 5795 5800 5805 5810 5815 5820 

* * * * * * 

GTGCGGTTGC TGGCGCCTAT ATCGCCGACA TCACCGATGG GGAAGATCGG GCTCGCCACT 
CACGCCAACG ACCGCGGATA TAGCGGCTGT AGTGGCTACC CCTTCTAGCC CGAGCGGTGA 

5825 5830 5835 5840 5845 5850 5855 5860 5865 5870 5875 5880 
****** 
TCGGGCTCAT GAGCGCTTGT TTCGGCGTGG GTATGGTGGC AGGCCCGTGG CCGGGGGACT 
AGCCCGAGTA CTCGCGAACA AAGCCGCACC CATACCACCG TCCGGGCACC GGCCCCCTGA 

5885 5890 5895 5900 5905 5910 5915 5920 5925 5930 5935 5940 
****** 
GTTGGGCGCC ATCTCCTTGC ATGCACCATT CCTTGCGGCG GCGGTGCTCA ACGGCCTCAA 
CAACCCGCGG TAGAGGAACG TACGTGGTAA GGAACGCCGC CGCCACGAGT TGCCGGAGTT 

5945 5950 5955 5960 5965 5970 5975 5980 5985 5990 5995 6000 
****** 

CCTACTACTG GGCTGCTTCC TAATGCAGGA GTCGCATAAG GGAGAGCGTC GACCTCGGGC 
GGATGATGAC CCGACGAAGG ATTACGTCCT CAGCGTATTC CCTCTCGCAG CTGGAGCCCG 

6005 6010 6015 6020 6025 6030 6035 6040 6045 6050 6055 6060 
****** 
CGCGTTGCTG GCGTTTTTCC ATAGGCTCCG CCCCCCTGAC GAGCATCACA AAAATCGACG 
GCGCAACGAC CGCAAAAAGG TATCCGAGGC GGGGGGACTG CTCGTAGTGT TTTTAGCTGC 

6065 6070 6075 6080 6085 6090 6095 6100 6105 6110 6115 6120 

* * * * * * 

CTCAAGTCAG AGGTGGCGAA ACCCGACAGG ACTATAAAGA TACCAGGCGT TTCCCCCTGG 
GAGTTCAGTC TCCACCGCTT TGGGCTGTCC TGATATTTCT ATGGTCCGCA AAGGGGGACC 

6125 6130 6135 6140 6145 6150 6155 6160 6165 6170 6175 6180 
****** 
AAGCTCCCTC GTGCGCTCTC CTGTTCCGAC CCTGCCGCTT ACCGGATACC TGTCCGCCTT 
TTCGAGGGAG CACGCGAGAG GACAAGGCTG GGACGGCGAA TGGCCTATGG ACAGGCGGAA 

6185 6190 6195 6200 6205 6210 6215 6220 6225 6230 6235 6240 

* * * * * * 
TCTCCCTTCG GGAAGCGTGG CGCTTTCTCA ATGCTCACGC TGTAGGTATC TCAGTTCGGT 
AGAGGGAAGC CCTTCGCACC GCGAAAGAGT TACGAGTGCG ACATCCATAG AGTCAAGCCA 

6245 6250 6255 6260 6265 6270 6275 6280 6285 6290 6295 6300 

* * * * * * 
GTAGGTCGTT CGCTCCAAGC TGGGCTGTGT GCACGAACCC CCCGTTCAGC CCGACCGCTG 
CATCCAGCAA GCGAGGTTCG ACCCGACACA CGTGCTTGGG GGGCAAGTCG GGCTGGCGAC 

6305 6310 6315 6320 6325 6330 6335 6340 6345 6350 6355 6360 
****** 
CGCCTTATCC GGTAACTATC GTCTTGAGTC CAACCCGGTA AGACACGACT TATCGCCACT 
GCGGAATAGG CCATTGATAG CAGAACTCAG GTTGGGCCAT TCTGTGCTGA ATAGCGGTGA 

6365 6370 6375 6380 6385 6390 6395 6400 6405 6410 6415 6420 
***** * 
GGCAGCAGCC ACTGGTAACA GGATTAGCAG AGCGAGGTAT GTAGGCGGTG CTACAGAGTT 
CCGTCGTCGG TGACCATTGT CCTAATCGTC TCGCTCCATA CATCCGCCAC GATGTCTCAA 

6425 6430 6435 6440 6445 6450 6455 6460 6465 6470 6475 6480 
****** 
CTTGAAGTGG TGGCCTAACT ACGGCTACAC TAGAAGGACA GTATTTGGTA TCTGCGCTCT 
GAACTTCACC ACCGGATTGA TGCCGATGTG ATCTTCCTGT CATAAACCAT AGACGCGAGA 
6485 6490 6495 6500 6505 6510 6515 6520 6525 6530 6535 6540 
****** 

GCTGAAGCCA GTTACCTTCG GAAAAAGAGT TGGTAGCTCT TGATCCGGCA AACAAACCAC 
CGACTTCGGT CAATGGAAGC CTTTTTCTCA ACCATCGAGA ACTAGGCCGT TTGTTTGGTG 

6545 6550 6555 6560 6565 6570 6575 6580 6585 6590 6595 6600 
****** 

CGCTGGTAGC GGTGGTTTTT TTGTTTGCAA GCAGCAGATT ACGCGCAGAA AAAAAGGATC 
GCGACCATCG CCACCAAAAA AACAAACGTT CGTCGTCTAA TGCGCGTCTT TTTTTCCTAG 

6605 6610 6615 6620 6625 6630 6635 6640 6645 6650 6655 6660 
****** 

TCAAGAAGAT CCTTTGATCT TTTCTACGGG GTCTGACGCT CAGTGGAACG AAAACTCACG 
AGTTCTTCTA GGAAACTAGA AAAGATGCCC CAGACTGCGA GTCACCTTGC TTTTGAGTGC 

6665 6670 6675 6680 6685 6690 6695 6700 6705 6710 6715 6720 
****** 

TTAAGGGATT TTGGTCATGA GATTATCAAA AAGGATCTTC ACCTAGATCC TTTTAAATTA 
AATTCCCTAA AACCAGTACT CTAATAGTTT TTCCTAGAAG TGGATCTAGG AAAATTTAAT 

6725 6730 6735 6740 6745 6750 6755 6760 6765 6770 6775 6780 
****** 

AAAATGAAGT TTTAAATCAA TCTAAAGTAT ATATGAGTAA ACTTGGTCTG ACAGTTACCA 
TTTTACTTCA AAATTTAGTT AGATTTCATA TATACTCATT TGAACCAGAC TGTCAATGGT 

6785 6790 6795 6800 6805 6810 6815 6820 6825 6830 6835 6840 

* * * * * * 

ATGCTTAATC AGTGAGGCAC CTATCTCAGC GATCTGTCTA TTTCGTTCAT CCATAGTTGC 
TACGAATTAG TCACTCCGTG GATAGAGTCG CTAGACAGAT AAAGCAAGTA GGTATCAACG 

6845 6850 6855 6860 6865 6870 6875 6880 6885 6890 6895 6900 
****** 

CTGACTCCCC GTCGTGTAGA TAACTACGAT ACGGGAGGGC TTACCATCTG GCCCCAGTGC 
GACTGAGGGG CAGCACATCT ATTGATGCTA TGCCCTCCCG AATGGTAGAC CGGGGTCACG 

6905 6910 6915 6920 6925 6930 6935 6940 6945 6950 6955 6960 
****** 

TGCAATGATA CCGCGAGACC CACGCTCACC GGCTCCAGAT TTATCAGCAA TAAACCAGCC 
ACGTTACTAT GGCGCTCTGG GTGCGAGTGG CCGAGGTCTA AATAGTCGTT ATTTGGTCGG 

6965 6970 6975 6980 6985 6990 6995 7000 7005 7010 7015 7020 
****** 

AGCCGGAAGG GCCGAGCGCA GAAGTGGTCC TGCAACTTTA TCCGCCTCCA TCCAGTCTAT 
TCGGCCTTCC CGGCTCGCGT CTTCACCAGG ACGTTGAAAT AGGCGGAGGT AGGTCAGATA 

7025 7030 7035 7040 7045 7050 7055 7060 7065 7070 7075 7080 

* * * * * * 

TAATTGTTGC CGGGAAGCTA GAGTAAGTAG TTCGCCAGTT AATAGTTTGC GCAACGTTGT 
ATTAACAACG GCCCTTCGAT CTCATTCATC AAGCGGTCAA TTATCAAACG CGTTGCAACA 

7085 7090 7095 7100 7105 7110 7115 7120 7125 7130 7135 7140 
****** 

TGCCATTGCT ACAGGCATCG TGGTGTCACG CTCGTCGTTT GGTATGGCTT CATTCAGCTC 
ACGGTAACGA TGTCCGTAGC ACCACAGTGC GAGCAGCAAA CCATACCGAA GTAAGTCGAG 

7145 7150 7155 7160 7165 7170 7175 7180 7185 7190 7195 7200 

* * * * * * 

CGGTTCCCAA CGATCAAGGC GAGTTACATG ATCCCCCATG TTGTGCAAAA AAGCGGTTAG 
GCCAAGGGTT GCTAGTTCCG CTCAATGTAC TAGGGGGTAC AACACGTTTT TTCGCCAATC 
7205 7210 7215 7220 7225 7230 7235 7240 7245 7250 7255 7260
******

CTCCTTCGGT CCTCCGATCG TTGTCAGAAG TAAGTTGGCC GCAGTGTTAT CACTCATGGT 
GAGGAAGCCA GGAGGCTAGC AACAGTCTTC ATTCAACCGG CGTCACAATA GTGAGTACCA 

7265 7270 7275 7280 7285 7290 7295 7300 7305 7310 7315 7320 
****** 

TATGGCAGCA CTGCATAATT CTCTTACTGT CATGCCATCC GTAAGATGCT TTTCTGTGAC 
ATACCGTCGT GACGTATTAA GAGAATGACA GTACGGTAGG CATTCTACGA AAAGACACTG 

7325 7330 7335 7340 7345 7350 7355 7360 7365 7370 7375 7380 
****** 

TGGTGAGTAC TCAACCAAGT CATTCTGAGA ATAGTGTATG CGGCGACCGA GTTGCTCTTG 
ACCACTCATG AGTTGGTTCA GTAAGACTCT TATCACATAC GCCGCTGGCT CAACGAGAAC 

7385 7390 7395 7400 7405 7410 7415 7420 7425 7430 7435 7440 
****** 

CCCGGCGTCA ACACGGGATA ATACCGCGCC ACATAGCAGA ACTTTAAAAG TGCTCATCAT 
GGGCCGCAGT TGTGCCCTAT TATGGCGCGG TGTATCGTCT TGAAATTTTC ACGAGTAGTA 

7445 7450 7455 7460 7465 7470 7475 7480 7485 7490 7495 7500 

* * * * * *

TGGAAAACGT TCTTCGGGGC GAAAACTCTC AAGGATCTTA CCGCTGTTGA GATCCAGTTC 
ACCTTTTGCA AGAAGCCCCG CTTTTGAGAG TTCCTAGAAT GGCGACAACT CTAGGTCAAG 

7505 7510 7515 7520 7525 7530 7535 7540 7545 7550 7555 7560 

* * * * * *

GATGTAACCC ACTCGTGCAC CCAACTGATC TTCAGCATCT TTTACTTTCA CCAGCGTTTC 
CTACATTGGG TGAGCACGTG GGTTGACTAG AAGTCGTAGA AAATGAAAGT GGTCGCAAAG 

7565 7570 7575 7580 7585 7590 7595 7600 7605 7610 7615 7620 
****** 

TGGGTGAGCA AAAACAGGAA GGCAAAATGC CGCAAAAAAG GGAATAAGGG CGACACGGAA 
ACCCACTCGT TTTTGTCCTT CCGTTTTACG GCGTTTTTTC CCTTATTCCC GCTGTGCCTT 

7625 7630 7635 7640 7645 7650 7655 7660 7665 7670 7675 7680 
****** 

ATGTTGAATA CTCATACTCT TCCTTTTTCA ATATTATTGA AGCATTTATC AGGGTTATTG 
TACAACTTAT GAGTATGAGA AGGAAAAAGT TATAATAACT TCGTAAATAG TCCCAATAAC 

7685 7690 7695 7700 7705 7710 7715 7720 7725 7730 7735 7740 

* * * * * * 

TCTCATGAGC GGATACATAT TTGAATGTAT TTAGAAAAAT AAACAAATAG GGGTTCCGCG 
AGAGTACTCG CCTATGTATA AACTTACATA AATCTTTTTA TTTGTTTATC CCCAAGGCGC 

7745 7750 7755 7760 7765 7770 7775 7780 7785 7790 7795 7800 

* * * * * * 

CACATTTCCC CGAAAAGTGC CACCTGACGT CTAAGAAACC ATTATTATCA TGACATTAAC 
GTGTAAAGGG GCTTTTCACG GTGGACTGCA GATTCTTTGG TAATAATAGT ACTGTAATTG 

7805 7810 7815 7820 7825 7830 7835 7840 7845 7850 7855 7860 
****** 

CTATAAAAAT AGGCGTATCA CGAGGCCCTG ATGGCTCTTT GCGGCACCCA TCGTTCGTAA 
GATATTTTTA TCCGCATAGT GCTCCGGGAC TACCGAGAAA CGCCGTGGGT AGCAAGCATT 

7865 7870 7875 7880 7885 7890 7895 7900 7905 7910 7915 7920 

* * * * * * 

TGTTCCGTGG CACCGAGGAC AACCCTCAAG AGAAAATGTA ATCACACTGG CTCACCTTCG
ACAAGGCACC GTGGCTCCTG TTGGGAGTTC TCTTTTACAT TAGTGTGACC GAGTGGAACC 
7925 7930 7935 7940 7945 7950 7955 7960 7965 7970 7975 7980
* * * * * *

GGTGGGCCTT TCTGCGTTTA TAAGGAGACA CTTTATGTTT AAGAAGGTTG GTAAATTCCT 
CCACCCGGAA AGACGCAAAT ATTCCTCTGT GAAATACAAA TTCTTCCAAC CATTTAAGGA 

7985 7990 7995 8000 8005 8010 8015 8020 8025 8030 8035 8040 
****** 

TGCGGCTTTG GCAGCCAAGC TAGATCCGGC TGTGGAATGT GTGTCAGTTA GGGTGTGGAA 
ACGCCGAAAC CGTCGGTTCG ATCTAGGCCG ACACCTTACA CACAGTCAAT CCCACACCTT 

8045 8050 8055 8060 8065 8070 8075 8080 8085 8090 8095 8100 
****** 

AGTCCCCAGG CTCCCCAGCA GGCAGAAGTA TGCAAAGCAT GCATCTCAAT TAGTCAGCAA 
TCAGGGGTCC GAGGGGTCGT CCGTCTTCAT ACGTTTCGTA CGTAGAGTTA ATCAGTCGTT 

8105 8110 8115 8120 8125 8130 8135 8140 8145 8150 8155 8160 
****** 

CCAGGCTCCC CAGCAGGCAG AACTATGCAA AGCATGCATC TCAATTAGTC AGCAACCATA 
GGTCCGAGGG GTCGTCCGTC TTCATACGTT TCGTACGTAG AGTTAATCAG TCGTTGGTAT 

8165 8170 8175 8180 8185 8190 8195 8200 8205 8210 8215 8220 
****** 

GTCCCGCCCC TAACTCCGCC CATCCCGCCC CTAACTCCGC CCAGTTCCGC CCATTCTCCG 
GAGGGCGGGG ATTGAGGCGG GTAGGGCGGG GATTGAGGCG GGTCAAGGCG GGTAAGAGGC 

8225 8230 8235 8240 8245 8250 8255 8260 8265 8270 8275 8280 
****** 

CCCCATGGCT GACTAATTTT TTTTATTTAT GCAGAGGCCG AGGCCGCCTC GGCCTCTGAG 
GGGGTACCGA CTGATTAAAA AAAATAAATA CGTCTCCGGC TCCGGCGGAG CCGGAGACTC 

8285 8290 8295 8300 8305 8310 8315 8320 8325 8330 8335 8340 
****** 

CTATTCCAGA AGTAGTGAGG AGGCTTTTTT GGAGGCCTAG GCTTTTGCAA AAACTAGCTT 
GATAAGGTCT TCATCACTCC TCCGAAAAAA CCTCCGGATC CGAAAACGTT TTTGATCGAA 

8345 8350 8355 8360 8365 8370 8375 8380 8385 8390 8395 8400 
****** 

GGGGCCACCG CTCAGAGCAC CTTCCACCAT GGCCACCTCA GCAAGTTCCC ACTTGAACAA 
CCCCGGTGGC GAGTCTCGTG GAAGGTGGTA CCGGTGGAGT CGTTCAAGGG TGAACTTGTT

8405 8410 8415 8420 8425 8430 8435 8440 8445 8450 8455 8460
* * * * * * 

AAACATCAAG CAAATGTACT TGTGCCTGCC CCAGGGTGAG AAAGTCCAAG CCATGTATAT 
TTTGTAGTTC GTTTACATGA ACACGGACGG GGTCCCACTC TTTCAGGTTC GGTACATATA 

8465 8470 8475 8480 8485 8490 8495 8500 8505 8510 8515 8520 
****** 

CTGGGTTGAT GGTACTGGAG AAGGACTCCG CTGCAAAACC CGCACCCTGG ACTGTGAGCC 
GACCCAACTA CCATGACCTC TTCCTGACGC GACGTTTTGG GCGTGGGACC TGACACTCGG

8525 8530 8535 8540 8545 8550 8555 8560 8565 8570 8575 8580
******

CAAGTGTGTA GAAGAGTTAC CTGAGTGGAA TTTTGATGGC TCTAGTACCT TTCAGTCT
GTTCACACAT CTTCTCAATG GACTCACCTT AAAACTACCG AGATCATGGA AAGTCAGA

8585 8590 8595 8600 8605 8610 8615 8620 8625 8630 8635 8640
****** 

GGGCTCCAAC AGTGACATGT ATCTCAGCCC TGTTGCCATG TTTCGGGACC CCTTCC
CCCGAGGTTG TCACTGTACA TAGAGTCGGG ACAACGCTAC AAAGCCCTGG GGAAGGG
8645 8650 8655 8660 8665 8670 8675 8680 8685 8690 8695 8700

* * * * * *

AGATCCCAAC AAGCTGGTGT TCTGTGAAGT TTTCAAGTAC AACCGGAAGC CTGCAGAGAC 
TCTAGGGTTG TTCGACCACA AGACACTTCA AAAGTTCATG TTGGCCTTCG GACGTCTCTG 

8705 8710 8715 8720 8725 8730 8735 8740 8745 8750 8755 8760 
****** 

CAATTTAAGG CACTCGTGTA AACGGATAAT GGACATGGTG AGCAACCAGC ACCCCTGGTT
GTTAAATTCC GTGAGCACAT TTGCCTATTA CCTGTACCAC TCGTTGGTCG TGGGGACCAA 

8765 8770 8775 8780 8785 8790 8795 8800 8805 8810 8815 8820 
****** 

TGGAATGGAA CAGGAGTATA CTCTGATGGG AACAGATGGG CACCCTTTTG GTTGGCCTTC 
ACCTTACCTT GTCCTCATAT GAGACTACCC TTGTCTACCC GTGGGAAAAC CAACCGGAAG 

8825 8830 8835 8840 8845 8850 8855 8860 8865 8870 8875 8880 

* ***** 

CAATGGCTTT CCTGGGCCCC AAGGTCCGTA TTACTCTGGT GTGGGCGCAG ACAAAGCCTA 
GTTACCGAAA GGACCCGGGG TTCCAGGCAT AATGACACCA CACCCGCGTC TGTTTCGGAT 

8885 8890 8895 8900 8905 8910 8915 8920 8925 8930 8935 8940 
****** 

TGGCAGGGAT ATCGTGGAGG CTCACTACCG CGCCTGCTTG TATGCTGGGG TCAAGATTAC 
ACCGTCCCTA TAGCACGTCC GAGTGATGGC GCGGACGAAC ATACGACCCC AGTTCTAATG 

8945 8950 8955 8960 8965 8970 8975 8980 8985 8990 8995 9000 

* * * * * *

AGGAACAAAT GCTGAGGTCA TGCCTGCCCA GTGGGAACTC CAAATAGGAC CCTGTGAAGG 
TCCTTGTTTA CGACTCCAGT ACGGACGGGT CACCCTTGAG GTTTATCCTG GGACACTTCC 

9005 9010 9015 9020 9025 9030 9035 9040 9045 9050 9055 9060 

* * * * * * 

AATCCGCATG GGAGATCATC TCTGGGTGGC CCGTTTCATC TTNCATCGAG TATGTGAAGA 
TTAGGCGTAC CCTCTAGTAG AGACCCACCG GGCAAACTAG AANGTAGCTC ATACACTTCT 

9065 9070 9075 9080 9085 9090 9095 9100 9105 9110 9115 9120 
****** 

CTTTGGGGTA ATAGCAACCT TTGACCCCAA GCCCATTCCT GGGAACTGGA ATGGTGCAGG 
GAAACCCCAT TATCGTTGGA AACTGGGGTT CGGGTAAGGA CCCTTGACCT TACCACGTCC 

9125 9130 9135 9140 9145 9150 9155 9160 9165 9170 9175 9180 
****** 

CTGCCATACC AACTTTAGCA CCAAGGCCAT GCGGGAGGAG AATGGTCTGA AGCACATCGA 
GACGGTATGG TTGAAATCGT GGTTCCGGTA CGCCCTCCTC TTACCAGACT TCGTGTAGCT 

9185 9190 9195 9200 9205 9210 9215 9220 9225 9230 9235 9240 

* * * * * * 

GGAGGCCATC GAGAAACTAA GCAAGCGGCA CCGGTACCAC ATTCGAGCCT ACGATGCCAA 
CCTCCGGTAG CTCTTTGATT CGTTCGCCGT GGCCATGGTG TAAGCTCGGA TGCTAGGGTT 

9245 9250 9255 9260 9265 9270 9275 9280 9285 9290 9295 9300 
****** 

GGGGGGCCTG GACAATGCCC GTGGTCTGAC TGGGTTCCAC GAAACGTCCA ACATCAACGA 
CCCCCCGGAC CTGTTACGGG CACCAGACTG ACCCAAGGTG CTTTGCAGGT TGTAGTTGCT 

9305 9310 9315 9320 9325 9330 9335 9340 9345 9350 9355 9360 
****** 

CTTTTCTGCT GGTGTCGCCA ATCGCAGTGC CAGCATCCGC ATTCCCCGGA CTGTCGGCCA 
GAAAAGACGA CCACAGCGGT TAGCGTCACG GTCGTAGGCG TAAGGGGCCT GACAGCCGGT 
9365 9370 9375 9380 9385 9390 9395 9400 9405 9410 9415 9420
* * * * * *

GGAGAAGAAA GGTTACTTTG AAGACCGCGG CCCCTCTGCC AATTGTGACC CCTTTGCAGT 
CCTCTTCTTT CCAATGAAAC TTCTGGCGCC GGGGAGACGG TTAACACTGG GGAAACGTCA 

9425 9430 9435 9440 9445 9450 9455 9460 9465 9470 9475 9480 
****** 

GACAGAAGCC ATCGTCCGCA CATGCCTTCT CAATGAGACT GGCCACGAGC CCTTCCAATA 
CTGTCTTCGG TAGCAGGCGT GTACGGAAGA GTTACTCTGA CCGGTGCTCG GGAAGGTTAT 

9485 9490 9495 9500 9505 9510 9515 9520 9525 9530 9535 9540 

* * * * * *

CAAAAACTAA TTAGACTTTG AGTGATCTTG AGCCTTTCCT AGTTCATCCC ACCCCGCCCC
GTTTTTGATT AATCTGAAAC TCACTAGAAC TCGGAAAGGA TCAAGTAGGG TGGGGCGGGG 

9545 9550 9555 9560 9565 9570 9575 9580 9585 9590 9595 9600 

* * * * * * 

AGAGAGATCT TTGTGAAGGA ACCTTACTTC TGTGGTGTGA CATAATTGGA CAAACTACCT 
TCTCTCTAGA AACACTTCCT TGGAATGAAG ACACCACACT GTATTAACCT GTTTGATGGA 

9605 9610 9615 9620 9625 9630 9635 9640 9645 9650 9655 9660 
****** 

ACAGAGATTT AAAGCTCTAA GGTAAATATA AAATTTTTAA GTGTATAATG TGTTAAACTA 
TGTCTCTAAA TTTCGAGATT CCATTTATAT TTTAAAAATT CACATATTAC ACAATTTGAT 

9665 9670 9675 9680 9685 9690 9695 9700 9705 9710 9715 9720 

* * * * * *

CTGATTCTAA TTGTTTGTGT ATTTTAGATT CCAACCTATG GAACTGATGA ATGGGAGCAG 
GACTAACATT AACAAACACA TAAAATCTAA GGTTGCATAC CTTGACTACT TACCCTCGTC 

9725 9730 9735 9740 9745 9750 9755 9760 9765 9770 9775 9780 
****** 

TGGTGGAATG CCTTTAATGA GGAAAACCTG TTTTGCTCAG AAGAAATGCC ATCTAGTGAT 
ACCACCTTAC GGAAATTACT CCTTTTGGAC AAAACGAGTC TTCTTTACGG TAGATCACTA 

9785 9790 9795 9800 9805 9810 9815 9820 9825 9830 9835 9840 
****** 

GATGAGGCTA CTGCTGACTC TCAACATTCT ACTCCTCCAA AAAAGAAGAG AAAGGTAGAA 
CTACTCCGAT GACGACTGAG AGTTGTAAGA TGAGGAGGTT TTTTCTTCTC TTTCCATCTT 

9845 9850 9855 9860 9865 9870 9875 9880 9885 9890 9895 9900 
****** 

GACCCCAAGG ACTTTCCTTC AGAATTGCTA AGTTTTTTGA GTCATGCTGT GTTTACTAAT 
CTGGGGTTCC TGAAAGGAAG TCTTAACGAT TCAAAAAACT CAGTACGACA CAAATCATTA 

9905 9910 9915 9920 9925 9930 9935 9940 9945 9950 9955 9960 

* * * * * * 

AGAACTCTTG CTTGCTTTGC TATTTACACC ACAAAGGAAA AAGCTGCACT GCTATACAAG 
TCTTGAGAAC GAACGAAACG ATAAATGTGG TGTTTCCTTT TTCGACGTGA CGATATGTTC 

9965 9970 9975 9980 9985 9990 9995 10000 10005 10010 10015 10020
****** 

AAAATTATGG AAAAATATTC TGTAACCTTT ATAAGTAGGC ATAACAGTTA TAATCATAAC 
TTTTAATACC TTTTTATAAG ACATTGGAAA TATTCATCCG TATTGTCAAT ATTAGTATTG

10025 10030 10035 10040 10045 10050 10055 10060 10065 10070 10075 10080
****** 

ATACTGTTTT TTCTTACTCC ACACAGGCAT AGAGTGTCTG CTATTAATAA CTATGCTCAA 
TATGACAAAA AAGAATGAGG TGTGTCCGTA TCTCACAGAC GATAATTATT GATACGAGTT 
10085 10090 10095 10100 10105 10110 10115 10120 10125 10130 10135 10140
* * * * * *

AAATTGTGTA CCTTTAGCTT TTTAATTTCT AAAGGGGTTA ATAAGGAATA TTTGATGTAT 
TTTAACACAT GGAAATCGAA AAATTAAACA TTTCCCCAAT TATTCCTTAT AAACTACATA 

10145 10150 10155 10160 10165 10170 10175 10180 10185 10190 10195 10200

* * * * * * 

AGTGCCTTGA CTAGAGATCA TAATCAGCCA TACCACATTT GTAGAGGTTT TACTTGCTTT 
TCACGGAACT GATCTCTAGT ATTAGTCGGT ATGGTCTAAA CATCTCCAAA ATGAACGAAA 

10205 10210 10215 10220 10225 10230 10235 10240 10245 10250 10255 10260

* * * * * * 

AAAAAACCTC CCACACCTCC CCCTGAACCT GAAACATAAA ATGAATGCAA TTGTTGTTGT 
TTTTTTGGAG GGTGTGGAGG GGGACTTGGA CTTTGTATTT TACTTACGTT AACAACAACA 

10265 10270 10275 10280 10285 10290 10295 10300 10305 10310 10315 10320

* * * * * *

TAACTTGTTT ATTGCAGCTT ATAATGGTTA CAAATAAAGC AATAGCATCA CAAATTTCAC 
ATTGAACAAA TAACGTCGAA TATTACCAAT GTTTATTTCG TTATCGTAGT GTTTAAAGTG 

10325 10330 10335 10340 10345 10350 10355 10360 10365 10370 10375 10380

* * * * * * 

AAATAAAGCA TTTTTTTCAC TGCATTCTAG TTGTGGTTTG TCCAAACTCA TCAATGTATC
TTTATTTCGT AAAAAAAGTG ACGTAAGATC AACACCAAAC AGGTTTGAGT AGTTACATAG 

10385 10390 10395 10400 10405 10410 10415 10420 10425 10430 10435 10440

* * * * * *

TTATCATGTC TGGATCTCTA GCTTCGTGTC AAGGACGGTG ACTGCAGTGA ATAATAAAAT 
AATAGTACAG ACCTAGAGAT CGAAGCACAG TTCCTGCCAC TGACGTCACT TATTATTTTA 

10445 10450 10455 10460 10465 10470 10475 10480 10485 10490 10495 10500
****** 

GTGTGTTTGT CCGAAATACG CGTTTTGAGA TTTCTGTCGC CGACTAAATT CATGTCGCGC 
CACACAAACA GGCTTTATGC GCAAAACTCT AAAGACAGCG GCTGATTTAA GTACAGCGCG 

10505 10510 10515 10520 10525 10530 10535 10540 10545 10550 10555 10560

* * * * * * 

GATAGTGGTG TTTATCGCCG ATAGAGATGG CGATATTGGA AAAATCGATA TTTGAAAATA 
CTATCACCAC AAATAGCGGC TATCTCTACC GCTATAACCT TTTTAGCTAT AAACTTTTAT

10565 10570 10575 10580 10585 10590 10595 10600 10605 10610 10615 10620

* * * * * * 

TGGCATATTG AAAATGTCGC CGATGTGAGT TTCTGTGTAA CTGATATCGC CATTTTTCCA 
ACCGTATAAC TTTTACAGCG GCTACACTCA AAGACACATT GACTATAGCG GTAAAAAGGT 

10625 10630 10635 10640 10645 10650 10655 10660 10665 10670 10675 10680

* * * * * * 

AAAGTGATTT TTGGGCATAC GCGATATCTG GCGATAGCGC TTATATCGTT TACGGGGGAT 
TTTCACTAAA AACCCGTATG CGCTATAGAC CGCTATCGCG AATATAGCAA ATGCCCCCTA 

10685 10690 10695 10700 10705 10710 10715 10720 10725 10730 10735 10740

* * * * * * 

GGCGATAGAC GACTTTGGTG ACTTGGGCGA TTCTGTGTGT CGCAAATATC GCAGTTTCGA 
CCGCTATCTG CTGAAACCAC TGAACCCGCT AAGACACACA GCGTTTATAG CGTCAAAGCT 

10745 10750 10755 10760 10765 10770 10775 10780 10785 10790 10795 10800

* * * * * * 

TATAGGTGAC AGACGATATG AGGCTATATC GCCGATAGAG GCGACATCAA GCTGGCACAT 
ATATCCACTG TCTGCTATAC TCCGATATAG CGGCTATCTC CGCTGTAGTT CGACCGTGTA 
10805 10810 10815 10820 10825 10830 10835 10840 10845 10850 10855 10860
* * * * * *

GGCCAATCCA TATCGATCTA TACATTGAAT CAATATTGGC CATTAGCCAT ATTATTCATT 
CCGGTTACGT ATAGCTAGAT ATGTAACTTA GTTATAACCG GTAATCGGTA TAATAAGTAA 

10865 10870 10875 10880 10885 10890 10895 10900 10905 10910 10915 10920
****** 

GCTTATATAG CATAAATCAA TATTGGCTAT TGGCCATTCC ATACGTTGTA TCCATATCAT 
CCAATATATC GTATTTAGTT ATAACCGATA ACCGGTAACG TATGCAACAT AGGTATAGTA 

10925 10930 10935 10940 10945 10950 10955 10960 10965 10970 10975 10980
****** 

AATATGTACA TTTATATTGG CTCATGTCCA ACATTACCGC CATGTTGACA TTGATTATTG 
TTATACATCT AAATATAACC GAGTACAGGT TGTAATGGCG GTACAACTGT AACTAATAAC 

10985 10990 10995 11000 11005 11010 11015 11020 11025 11030 11035 11040

* * * * * * 

ACTAGTTATT AATAGTAATC AATTACGGGG TCATTAGTTC ATAGCCCATA TATGGAGTTC 
TGATCAATAA TTATCATTAG TTAATGCCCC AGTAATCAAG TATCGGGTAT ATACCTCAAG 

11045 11050 11055 11060 11065 11070 11075 11080 11085 11090 11095 11100
****** 

CCCGTTACAT AACTTACGGT AAATGGCCCG CCTGGCTGAC CGCCCAACGA CCCCCGCCCA 
GCGCAATGTA TTGAATGCCA TTTACCGGGC GGACCGACTG GCGGGTTGCT GGGGGCGGGT 

11105 11110 11115 11120 11125 11130 11135 11140 11145 11150 11155 11160
****** 

TTGACGTCAA TAATGACGTA TGTTCCCATA GTAACGCCAA TAGGGACTTT CCATTGACGT 
AACTGCAGTT ATTACTGCAT ACAAGGGTAT CATTGCGGTT ATCCCTGAAA GGTAACTGCA 

11165 11170 11175 11180 11185 11190 11195 11200 11205 11210 11215 11220
****** 

CAATGGGTGG AGTATTTACG GTAAACTGCC CACTTGGCAG TACATCAAGT GTATCATATG 
GTTACCCACC TCATAAATGC CATTTGACGG GTGAACCGTC ATGTAGTTCA CATAGTATAC 

11225 11230 11235 11240 11245 11250 11255 11260 11265 11270 11275 11280
****** 

CCAAGTACGC CCCCTATTGA CGTCAATGAC GGTAAATGGC CCGCCTGGCA TTATGCCCAG 
GGTTCATGCG GGGGATAACT GCAGTTACTG CCATTTACCG GGCGGACCGT AATACGGGTC 

11285 11290 11295 11300 11305 11310 11315 11320 11325 11330 11335 11340

* * * * * * 

TACATGACCT TATGGGACTT TCCTACTTGG CAGTACATCT ACGTATTAGT CATCGCTATT 
ATGTACTGGA ATACCCTGAA AGGATGAACC GTCATGTAGA TGCATAATCA GTAGCGATAA 

11345 11350 11355 11360 11365 11370 11375 11380 11385 11390 11395 11400
****** 

ACCATGGTGA TGCGGTTTTG GCAGTACATC AATGGGCGTG GATAGCGGTT TGACTCACGG 
TGGTACCACT ACGCCAAAAC CGTCATGTAG TTACCCGCAC CTATCGCCAA ACTGAGTGCC 

11405 11410 11415 11420 11425 11430 11435 11440 11445 11450 11455 11460
****** 

GGATTTCCAA GTCTCCACCC CATTGACGTC AATGGGAGTT TGTTTTGGCA CCAAAATCAA 
CCTAAAGGTT CAGAGGTGGG GTAACTGCAG TTACCCTCAA ACAAAACCGT GGTTTTACTT 

11465 11470 11475 11480 11485 11490 11495 11500 11505 11510 11515 11520
****** 

CGGGACTTTC CAAAATGTCG TAACAACTCC GCCCCATTGA CGCAAATGGG CGGTAGGCGT 
GCCCTGAAAG GTTTTACAGC ATTGTTGAGG CGGGGTAACT GCGTTTACCC GCCATCCGCA 
11525 11530 11535 11540 11545 11550 11555 11560 11565 11570 11575 11580

* * * * * 

GTACGGTGGG AGCTCTATAT AAGCAGAGCT CGTTTAGTGA ACCGTCAGAT CGCCTGGAGA 
CATGCCACCC TCCAGATATA TTCGTCTCGA GCAAATCACT TGGCAGTCTA GCGGACCTCT 

11585 11590 11595 11600 11605 11610 11615 11620 11625 11630 11635 11640
****** 

CGCCATCCAC GCTGTTTTGA CCTCCATAGA AGACACCGGG ACCGATCCAG CCTCCGCGGC 
GCGGTAGGTG CGACAAAACT GGAGGTATCT TCTGTGGCCC TGGCTAGGTC GGAGGCGCCG 

11645 11650 11655 11660 11665 11670 11675 11680 11685 11690 11695 11700
****** 

CGGGAACGGT GCATTGGAAC GCGGATTCCC CGTGCCAAGA GTGACGTAAG TACCGCCTAT 
GCCCTTGCCA CGTAACCTTG CGCCTAAGGG GCACGGTTCT CACTGCATTC ATGGCGGATA 

11705 11710 11715 11720 11725 11730 11735 11740 11745 11750 11755 11760
****** 

AGAGTCTATA GGCCCACCCC CTTGCCTTCT TATGCATGCT ATACTGTTTT TGGCTTGGGG 
TCTCAGATAT CCGGGTGGGG GAACCGAAGA ATACGTACGA TATGACAAAA ACCGAACCCC 

11765 11770 11775 11780 11785 11790 11795 11800 11805 11810 11815 11820
****** 

TCTATACACC CCCGCTTCCT CATGTTATAG GTGATGGTAT AGCTTAGCCT ATAGGTGTGG 
AGATATGTGG GGGCGAAGGA GTACAATATC CACTACCATA TCGAATCGGA TATCCACACC 

11825 11830 11835 11840 11845 11850 11855 11860 11865 11870 11875 11880

* * * * * *

GTTATTGACC ATTATTGACC ACTCCCCTAT TGGTGACGAT ACTTTCCATT ACTAATCCAT 
CAATAACTGG TAATAACTGG TGAGGGGATA ACCACTGCTA TGAAAGGTAA TGATTAGGTA 

11885 11890 11895 11900 11905 11910 11915 11920 11925 11930 11935 11940

* * * * * * 

AACATGGCTC TTTGCCACAA CTCTCTTTAT TGGCTATATG CCAATACACT GTCCTTCAGA 
TTGTACCGAG AAACGGTGTT GAGAGAAATA ACCGATATAC GGTTATGTGA CAGGAAGTCT 

11945 11950 11955 11960 11965 11970 11975 11980 11985 11990 11995 12000
****** 

GACTGACACG GACTCTGTAT TTTTACAGGA TGGGGTCTCA TTTATTATTT ACAAATTCAC 
CTGACTGTGC CTGAGACATA AAAATGTCCT ACCCCAGAGT AAATAATAAA TGTTTAAGTC 

12005 12010 12015 12020 12025 12030 12035 12040 12045 12050 12055 12060
****** 

ATATACAACA CCACCCTCCC CAGTGCCCGC ACTTTTTATT AAACATAACG TGGGATCTCC
TATATGTTGT GGTGGCAGGG GTCACGGGCG TCAAAAATAA TTTGTATTGC ACCCTACACG 

12065 12070 12075 12080 12085 12090 12095 12100 12105 12110 12115 12120

* * * * * * 

ACGCGAATCT CGGGTACGTG TTCCGGACAT GGGCTCTTCT CCGGTAGCGG CGGAGCTT:? 
TGCGCTTAGA GCCCATGCAC AAGGCCTGTA CCCGAGAAGA GGCCATCGCC GCCTCGAA/A 

12125 12130 12135 12140 12145 12150 12155 12160 12165 12170 12175 12180
***** 

ACATCCGAGC CCTGCTCCCA TGCCTCCAGC GACTCATGGT CGCTCGGCAG CTCCTTG
TGTAGGCTCG GGACGAGGGT ACGGAGGTCG CTGAGTACCA GCGAGCCGTC GAGGAA

12185 12190 12195 12200 12205 12210 12215 12220 12225 12230 12235 12240
***** 

CTAACAGTGG AGGCCAGACT TAGGCACAGC ACGATGCCCA CCACCACCAG TGTGCC 
GATTGTCACC TCCGGTCTGA ATCCGTGTCG TGCTACGGGT GGTGGTGGTC ACACGG
12245 12250 12255 12260 12265 12270 12275 12280 12285 12290 12295 12300
* * * * * *

AAGGCCGTGG CGGTAGGGTA TGTGTCTGAA AATGAGCTCG GGGAGCGGGC TTGCACCGCT 
TTCCGGCACC GCCATCCCAT ACACAGACTT TTACTCGAGC CCCTCGCCCG AACGTGGCGA 

12305 12310 12315 12320 12325 12330 12335 12340 12345 12350 12355 12360

* * * * * * 

GACGCATTTG GAAGACTTAA GGCAGCGGCA GAAGAAGATG CAGGCAGCTG AGTTGTTGTG 
CTGCGTAAAC CTTCTGAATT CCGTCGCCGT CTTCTTCTAC GTCCGTCGAC TCAACAACAC 

12365 12370 12375 12380 12385 12390 12395 12400 12405 12410 12415 12420

* * * * * * 

TTCTGATAAG AGTCAGAGGT AACTCCCGTT GCGGTGCTGT TAACGGTGGA GGGCAGTGTA 
AAGACTATTC TCAGTCTCCA TTGAGGGCAA CGCCACGACA ATTGCCACCT CCCGTCACAT 

12425 12430 12435 12440 12445 12450 12455 12460 12465 12470 12475 12480
****** 

GTCTGAGCAG TACTCGTTGC TGCCGCGCGC GCCACCAGAC ATAATAGCTG ACAGACTAAC 
CAGACTCGTC ATGAGCAACG ACGGCGCGCG CGGTGGTCTG TATTATCGAC TGTCTGATTG 

12485 12490 12495 12500 12505 12510 12515 12520 12525 12530 12535 12540

* * * * * * 

AGACTGTTCC TTTCCATGGG TCTTTTCTGC AGTCACCGTC CTTGACACGA AGCTTACCAT 
TCTGACAAGG AAAGGTACCC AGAAAAGACG TCAGTGGCAG GAACTGTGCT TCGAATGGTA 

12545 12550 12555 12560 12565 12570 12575 12580 12585 12590 12595 12600

* * * * * *

GGGTGTGCCC ACTCAGGTCC TGGGGTTGCT GCTGCTGTGG CTTACAGATG CCAGATGTGA 
CCCACACGGG TGAGTCCAGG ACCCCAACGA CGACGACACC GAATGTCTAC GGTCTACACT 

12605 12610 12615 12620 12625 12630 12635 12640 12645 12650 12655 12660
****** 

GATCGTTCTC ACGCAGTCTC CAGGCACCCT GTCTCTGTCT CCAGGGGAAA GAGCCACCTT 
CTAGCAAGAG TGCGTCAGAG GTCCGTGGGA CAGAGACAGA GGTCCCCTTT CTCGGTGGAA 

12665 12670 12675 12680 12685 12690 12695 12700 12705 12710 12715 12720
****** 

CTCCTGTAGG TCCAGTCACA GCATTCGCAG CCGCCGCGTA GCCTGGTACC AGCACAAACC 
GAGGACATCC AGGTCAGTGT CGTAAGCGTC GGCGGCGCAT CGGACCATGG TCGTGTTTGG 

12725 12730 12735 12740 12745 12750 12755 12760 12765 12770 12775 12780

* * * * * *

TGGCCAGGCT CCAAGGCTGG TCATACATGG TGTTTCCAAT AGGGCCTCTG GCATCTCAGA 
ACCGGTCCGA GGTTCCGACC AGTATGTACC ACAAAGGTTA TCCCGGAGAC CGTAGAGTCT 

12785 12790 12795 12800 12805 12810 12815 12820 12825 12830 12835 12840

* * * * * * 

CAGGTTCAGC GGCAGTGGGT CTGGGACAGA CTTCACTCTC ACCATCACCA GAGTGGAGCC 
GTCCAAGTCG CCGTCACCCA GACCCTGTCT GAAGTGAGAG TGGTAGTGGT CTCACCTCGG 

12845 12850 12855 12860 12865 12870 12875 12880 12885 12890 12895 12900
****** 

TGAAGACTTT GCACTGTACT ACTGTCAGGT CTATGGTGCC TCCTCGTACA CTTTTGGCCA 
ACTTCTGAAA CGTGACATGA TGACAGTCCA GATACCACGG AGGAGCATGT GAAAACCGGT 

12905 12910 12915 12920 12925 12930 12935 12940 12945 12950 12955 12960
****** 
GGGGACCAAA CTGGAGAGGA AACGAACTGT GCCTGCACCA TCTGTCTTCA TCTTCCCGCC 
CCCCTGGTTT GACCTCTCCT TTGCTTGACA CGGACGTGGT AGACAGAAGT AGAAGGGCGG 
12965 12970 12975 12980 12985 12990 12995 13000 13005 13010 13015 13020
****** 

ATCTGATGAG CAGTTGAAAT CTGGGACTGC CTCTGTTGTG TGCCTGCTGA ATAACTTCTA 
TAGACTACTC GTCAACTTTA GACCCTGACG GAGACAACAC ACGGACGACT TATTGAAGAT 

13025 13030 13035 13040 13045 13050 13055 13060 13065 13070 13075 13080
****** 

TCCCAGAGAG GCCAAAGTAC AGTGGAAGGT GGATAACGCC CTCCAATCGG GTAACTCCCA 
AGGGTCTCTC CGGTTTCATG TCACCTTCCA CCTATTGCGG GAGGTTAGCC CATTGAGGGT 

13085 13090 13095 13100 13105 13110 13115 13120 13125 13130 13135 13140
****** 

GGAGAGTGTC ACAGAGCAGG ACAGCAAGGA CAGCACCTAC AGCCTCAGCA GCACCCTGAC 
CCTCTCACAG TGTCTCGTCC TGTCGTTCCT GTCGTGGATG TCGGAGTCGT CGTGGGACTG 

13145 13150 13155 13160 13165 13170 13175 13180 13185 13190 13195 13200
****** 

GCTGAGCAAA GCAGACTACG AGAAACACAA AGTCTACGCC TGCGAAGTCA CCCATCAGGG 
CGACTCGTTT CGTCTGATCC TCTTTGTGTT TCAGATGCGG ACGCTTCAGT GGGTAGTCCC 

13205 13210 13215 13220 13225 13230 13235 13240 13245 13250 13255

***** 
CCTGAGATCG CCCGTCACAA AGAGCTTCAA CAGGGGAGAG TGTTAATTCT AGAGAA 
GGACTCTAGC GGGCAGTGTT TCTCGAAGTT GTCCCCTCTC ACAATTAAGA TCTCTT 